I’ve written only two peer-reviewed scientific articles, but both were great adventures. I’d love to do more original scientific research, but I doubt the National Science Foundation will give a grant to someone who’s never spent a single day in graduate school.
The first and by far more important article was:
“Modelling the Recent Common Ancestry of All Living Humans” by Douglas L. T. Rohde, Steve Olson, and Joseph T. Chang, Nature 431(2004):562-566
The idea at the heart of that paper occurred to me when I was writing my book Mapping Human History: Genes, Race, and Our Common Origins. While working on one of the chapters, I asked myself the following question: If ancestors are considered in conventional terms (as a person’s parents, grandparents, great-grandparents, and so on), who was the most recent common ancestor (MRCA) of all living humans? In other words, if everyone knew exactly who their direct ancestors were and could draw a tree of those ancestors extending indefinitely into the past, who is the first person, going back in time, who is an ancestor of everyone living today?
I soon discovered that the question had never been answered before, and for a good reason: it’s a very hard question. It depends on the detailed migration, mating, and survival patterns of humans over hundreds and thousands of years.
But someone had done work on a much simpler version of the question. A professor of statistics at Yale, Joe Chang, published a paper in 1999 proving that the number of generations back to the MRCA in a randomly mating population is surprisingly small. (Here’s how small, for those who remember a little high school mathematics.)
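To give a feel for how small, here is a minimal sketch of a randomly mating population, not Chang's proof or his code. It assumes a constant population of n people per generation, with each person's two parents drawn independently and uniformly from the previous generation. The function name and the bitmask bookkeeping are illustrative choices. In runs of this toy model the number of generations back to the MRCA should come out close to log2(n), the growth rate Chang's paper established for random mating.

```python
import math
import random


def generations_to_mrca(n, rng):
    """Count generations back until someone is an ancestor of all n living people.

    Assumed toy model: constant population size n, and every person picks
    two parents independently and uniformly from the previous generation.
    """
    everyone = (1 << n) - 1
    # descendants[i] is a bitmask of the living people descended from
    # person i in the generation currently being examined.
    descendants = [1 << i for i in range(n)]
    generations = 0
    while everyone not in descendants:
        parents = [0] * n
        for mask in descendants:
            if mask:  # people with no living descendants contribute nothing
                for _ in range(2):
                    parents[rng.randrange(n)] |= mask
        descendants = parents
        generations += 1
    return generations


if __name__ == "__main__":
    n = 500
    found = generations_to_mrca(n, random.Random(0))
    print(f"n = {n}: MRCA found {found} generations back; log2(n) = {math.log2(n):.1f}")
```

Doubling the population adds only about one generation, which is why the answer stays small even for very large populations.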
The problem is that human populations don’t mate randomly. Random mating would mean that you or I have an equal chance of marrying and having children with any other person in the world. But we’re much more likely to have children with people we know, who live near us, who speak the same language as us, or are from the same social class as us. The nonrandom nature of human mating patterns is what makes the problem so hard.
I called Joe on the phone and we talked about how you might solve the more general problem — he had already started thinking about the complications. Meanwhile, I began reading some of the work that was just then coming out on what are called small world networks. A small world network typically consists of clusters of objects (whether points on a paper, networked computers, interacting proteins, or whatever) that share many interconnections. These clusters are in turn linked to other clusters by occasional interconnections that occur more or less at random. The remarkable thing about a small world network is that even when the clusters are loosely connected, the network as a whole can exhibit behavior much like that of a single tightly connected cluster.
One day, while flying back to Washington, D.C., from California, it suddenly occurred to me that human mating patterns form excellent small world networks. In that case, the MRCA of the human population shouldn’t have lived much earlier than the MRCA of a similarly sized population that mated randomly.
I was sufficiently confident in my conclusion to publish it as a conjecture in my book. I wrote that the most recent common ancestor of all living humans must have lived just a few thousand years ago, and that most if not all of the world’s population must therefore be descended from prominent historical figures who lived in the first millennium BC. But when the book came out, several geneticists wrote in reviews that my ideas about ancestry couldn’t possibly be right. So I started talking with Joe again about writing a paper to prove the point. Meanwhile, a computer scientist at MIT named Doug Rohde, who was working on neural networks, heard me talking about our common ancestry on a radio show. He, too, immediately assumed that the idea was daft, but he realized that he could build a computer model that would simulate the mating of realistically sized human populations over thousands of years. When Joe and I heard that he was working on the problem, we invited him to join in our collaboration.
Doing research with Joe and Doug for a year or two was a fascinating experience. For one thing, I got to see how a scientific collaboration works. I don’t think any of us could have published a paper on this subject — especially in Nature — working by ourselves. The paper wouldn’t have worked without the unique contributions each of us made to it. I also got to see aspects of scientific publishing that I’d heard about but never experienced before: the capriciousness of who you get as reviewers and what those reviewers say, the influence a good editor has on a paper’s progress, the churning of the publicity machine at a major journal, the challenge of getting a paper noticed once it’s published.
The other peer-reviewed paper was:
“The Use of Racial, Ethnic, and Ancestral Categories in Human Genetics Research,” American Journal of Human Genetics, Vol. 77, No. 4, October 2005
The story behind this paper is much more prosaic. In 2004 I began working on a white paper for the National Human Genome Research Institute at the National Institutes of Health on how considerations of race and ethnicity influence genetics research. The paper served as background reading for a meeting that fall, and after the meeting Francis Collins, the director of NHGRI, suggested that the paper be revised and submitted to the American Journal of Human Genetics. After a long series of reviews and revisions, AJHG published the paper in October 2005. The paper’s a pretty good review of the historical, cultural, and biological factors that influence how we think about race and ethnicity.