# NatureCommonAncestors Article

## Transcript

letters to nature

**Box 1 Graph-theoretical definitions**

The length of a path in a graph, $G$, is the number of edges in the path. For each pair of nodes $i$ and $j$ in $G$, the distance $d(i, j)$ is defined to be the length of a shortest path joining $i$ and $j$. The radius of $G$ is

$$R = \min_{i \in G} \max_{k \in G} d(i, k),$$

and a node $i$ is called a centre of $G$ if $\max_{k \in G} d(i, k) = R$. Assume $R \ge 1$; the case $R = 0$ ($G$ has one node) was treated previously (ref. 2). For each centre node $i$, let $S_i$ be a set of minimal size that consists of neighbours of node $i$ and satisfies $\min\{d(j, k) : j \in S_i \cup \{i\}\} \le R - 1$ for all $k \in G$. $H_i$ is defined as the number of nodes in $S_i$, $H$ is the minimum of $H_i$ over all centres $i$, and $\Delta = 1 - 1/H$. The diameter of $G$ is

$$D = \max_{i, k \in G} d(i, k).$$

exclusively maternal lines would have lived something like 50,000 times earlier, in the order of one million generations ago.

As genealogical ancestry is traced back beyond the MRCA, a growing percentage of people in earlier generations are revealed to be common ancestors of the present-day population. Tracing further back in time, there was a threshold, let us say $U_n$ generations ago, before which ancestry of the present-day population was an all or nothing affair. That is, each individual living at least $U_n$ generations ago was either a common ancestor of all of today’s humans or an ancestor of no human alive today. Thus, among all individuals living at least $U_n$ generations ago, each present-day human has exactly the same set of ancestors. We refer to this point in time as the identical ancestors (IA) point. As with the MRCA point, the IA point is also quite recent in a randomly mating population: $U_n \sim 1.77 \log_2 n$ generations ago.

The major problem in applying these results to human populations is that mating is not random in the real world. Mating patterns are structured by geography, proximity, culture, language and social class. Nevertheless, even in populations with considerable internal structure, the time to the MRCA can be remarkably brief. To demonstrate this in a tractable mathematical model, consider a population of size $n$ divided into randomly mating subpopulations that are linked by occasional migrants. The population is represented by a graph, $G$, with a node for each subpopulation. Edges indicate pairs of nodes that exchange a small number (for example, one pair) of migrants per generation. Let $R$ denote the radius of $G$, and let $\Delta$ be a quantity ranging between 0 and 1 that depends on the structure of $G$ (see Box 1). A probabilistic analysis (see Supplementary Information) shows that as $n \to \infty$,

$$T_n \sim (R + \Delta) \log_2 n.$$

Furthermore, if we let $D$ denote the diameter of the graph, then the number of generations, $U_n$, since the IA point satisfies

$$U_n \sim (D + 1.77) \log_2 n.$$

Computer simulations accord with these theoretical predictions. Tables 1 and 2 give distributions of $T_n$ and $U_n$ for small populations of varying sizes in graphs with one node, three connected nodes, five fully connected nodes and for a ten-node graph loosely based on world geography as shown in Fig. 1. In these simulations, neighbouring subpopulations exchange one pair of migrants per generation. Each mean is calculated from 100 model runs. Although guaranteed to be accurate only for sufficiently large $n$, the theoretical predictions describe the simulations quite well even for models with just a few thousand individuals. Whenever $n$ is doubled, $T_n$ is expected to increase by $R + \Delta$, and $U_n$ is expected to increase by $D + 1.77$. These predicted increases, which are listed in the last columns of Tables 1 and 2, agree closely with the simulation results.

To hazard a rough first guess about human recent common ancestors, we could extrapolate the results for the graph of Fig. 1 to a growing population with a final size of 250 million. When applying this model to a growing population, the fixed population size that provides the best approximation is the size at the time that the MRCA lived. We take this effective population size to be 250 million, which is approximately the global population in the year AD 1. Starting from $n = 16{,}000$, a population of 250 million is reached by doubling 14 times. Approximating the increases in $T_n$ and $U_n$ beyond the values seen in Tables 1 and 2 by their theoretical predictions for each doubling of $n$, we arrive at $T_n \approx 34 + 14 \times 3 = 76$ generations (about 2,300 years) and $U_n \approx 74 + 14 \times 6.77 \approx 169$ generations (about 5,000 years). These estimates would suggest, with the exchange of just one pair of migrants per generation between large panmictic populations of realistic size, that the MRCA appears in about the year 300 BC, and all modern individuals have identical ancestors by about 3,000 BC. Such estimates are extremely tentative, and the model contains several obvious sources of error, as it was motivated more by considerations of theoretical insight and tractability than by realism. Its main message is that substantial forms of population subdivision can still be compatible with very recent common ancestors.

The dynamics of human subpopulations are much more complex than those in the simple graph model discussed above. Although these complexities make theoretical analysis difficult, a computer model incorporating more complicated forms of population substructure and migration allows the demographic history of human populations to be simulated. The Supplementary Information contains more details on the model and computations; here we briefly outline some of the main points.

This model is based on a simplified projection of the world’s

**Table 1** Simulations of $T_n$

| Graph | $n$ = 1,000 | $n$ = 2,000 | $n$ = 4,000 | $n$ = 8,000 | $n$ = 16,000 | $R + \Delta$ |
|---|---|---|---|---|---|---|
| One node | 10.8 (0.4) | 11.8 (0.4) | 12.8 (0.4) | 13.9 (0.3) | 14.8 (0.4) | 1.00 |
| Three fully connected nodes | 14.0 (0.7) | 15.6 (0.7) | 17.1 (0.9) | 18.9 (0.8) | 20.3 (1.0) | 1.50 |
| Five fully connected nodes | 14.0 (0.5) | 15.8 (0.5) | 17.8 (0.5) | 19.6 (0.5) | 21.5 (0.6) | 1.75 |
| Ten-node graph shown in Fig. 1 | 21.1 (1.3) | 24.3 (1.5) | 27.6 (1.5) | 30.5 (1.5) | 33.8 (1.7) | 3.00 |

Means (standard deviations in parentheses) of $T_n$ (the number of generations back to the MRCA) for graph-structured populations exchanging a single pair of migrants per edge per generation. The last column shows $R + \Delta$, the expected asymptotic increase in $T_n$ per doubling of $n$.

**Table 2** Simulations of $U_n$

| Graph | $n$ = 1,000 | $n$ = 2,000 | $n$ = 4,000 | $n$ = 8,000 | $n$ = 16,000 | $D + 1.77$ |
|---|---|---|---|---|---|---|
| One node | 20.8 (1.6) | 22.6 (1.5) | 24.6 (1.5) | 26.5 (1.6) | 28.3 (1.4) | 1.77 |
| Three fully connected nodes | 27.4 (1.5) | 30.3 (1.4) | 33.4 (1.5) | 36.2 (1.7) | 38.9 (1.5) | 2.77 |
| Five fully connected nodes | 25.9 (1.3) | 28.9 (1.4) | 32.1 (1.7) | 35.3 (1.5) | 37.9 (1.4) | 2.77 |
| Ten-node graph shown in Fig. 1 | 46.3 (2.7) | 53.0 (2.7) | 59.8 (2.7) | 66.8 (2.9) | 73.6 (2.7) | 6.77 |

Means (standard deviations in parentheses) of $U_n$ (the number of generations back to the IA point) for graph-structured populations exchanging a single pair of migrants per edge per generation. The last column shows $D + 1.77$, the expected asymptotic increase in $U_n$ per doubling of $n$.

563 NATURE | VOL 431 | 30 SEPTEMBER 2004 | www.nature.com/nature © 2004 Nature Publishing Group
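The Box 1 definitions and the doubling extrapolation can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `graph_invariants` and `extrapolate` are hypothetical, the graph is assumed to be connected and given as an adjacency dictionary, the minimal set of neighbours is found by brute force (practical only for small graphs), and `delta` stands for the Box 1 quantity between 0 and 1. For the one-node graph, which Box 1 excludes, the sketch returns `delta = 1.0` as a convention inferred from the 1.00 entry in Table 1.

```python
import math
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path lengths (edge counts) from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def graph_invariants(adj):
    """Return (R, D, delta) for a connected graph, following Box 1."""
    nodes = list(adj)
    d = {i: bfs_distances(adj, i) for i in nodes}
    ecc = {i: max(d[i].values()) for i in nodes}  # max distance from i
    R, D = min(ecc.values()), max(ecc.values())
    if R == 0:
        # One-node graph: excluded by Box 1; convention inferred from Table 1.
        return 0, 0, 1.0
    H = None
    for i in (c for c in nodes if ecc[c] == R):  # loop over centres
        # Smallest set S of neighbours of i such that every node k lies
        # within distance R - 1 of some member of S plus i itself.
        for size in range(len(adj[i]) + 1):
            covered = any(
                all(min(d[j][k] for j in (*S, i)) <= R - 1 for k in nodes)
                for S in combinations(adj[i], size)
            )
            if covered:
                H = size if H is None else min(H, size)
                break
    return R, D, 1 - 1 / H


def extrapolate(value_at_start, per_doubling,
                n_start=16_000, n_final=250_000_000):
    """Add the predicted per-doubling increase for each doubling of n."""
    doublings = round(math.log2(n_final / n_start))
    return doublings, value_at_start + doublings * per_doubling


# Three fully connected nodes: R + delta and D + 1.77 should match the
# last columns of Tables 1 and 2 (1.50 and 2.77).
k3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
R, D, delta = graph_invariants(k3)
print(R + delta, D + 1.77)

# Ten-node graph values taken from the text: about 34 and 74 generations
# at n = 16,000, increasing by 3.00 and 6.77 per doubling.
print(extrapolate(34, 3.00))
print(extrapolate(74, 6.77))
```

The edges of the ten-node graph of Fig. 1 are not listed in the text, so the sketch uses its stated per-doubling increases (3.00 and 6.77) directly rather than recomputing them from an adjacency list.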

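For the single-node (randomly mating) case, the two quantities reported in Tables 1 and 2 can be illustrated with a small simulation. This is a sketch under stated assumptions, not the procedure used for the tables: the population size is constant, generations do not overlap, and every individual draws two parents independently and uniformly from the previous generation (so the two draws may coincide). The function name `simulate_mrca_and_ia` is hypothetical.

```python
import random


def simulate_mrca_and_ia(n, seed=0):
    """Return (T, U): generations back to the MRCA and to the IA point."""
    rng = random.Random(seed)
    full = (1 << n) - 1
    # masks[i] is a bitset of the present-day individuals descended from
    # individual i of the generation currently being examined.
    masks = [1 << i for i in range(n)]
    generations_back = 0
    T = None
    while True:
        generations_back += 1
        parents = [0] * n
        for child_mask in masks:
            for _ in range(2):  # two parents, drawn with replacement
                parents[rng.randrange(n)] |= child_mask
        masks = parents
        # MRCA: first generation containing an ancestor of everyone.
        if T is None and any(m == full for m in masks):
            T = generations_back
        # IA point: everyone is an ancestor of all or of none.
        if all(m == 0 or m == full for m in masks):
            return T, generations_back


T, U = simulate_mrca_and_ia(1000, seed=1)
print(T, U)
```

A single run gives one draw from the distributions of the two quantities; averaging over many seeds would be needed for a comparison with the means in the one-node rows of the tables.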
**Figure 1** World map viewed as a ten-node graph. This graph has radius 3 and diameter 5.

actual inhabited land masses and has three levels of substructure: continents, ‘countries’ and ‘towns.’ Figure 2 depicts the model’s geography and migration routes used before AD 1500, with the countries shown as squares and the number of towns per country differing from continent to continent. Towns and countries represent both the local geographical areas and the relevant social and ethnic groups from which most people find mates.

The model uses a simplified migration system in which each person has a single opportunity to migrate from his or her town of birth. The probabilities of leaving a town or a country are set at various levels to reflect different migration patterns. Migrants who move between towns can travel to any other town within the country. A migrant who leaves a country for another country within the same continent chooses the destination with a probability that diminishes as the inverse square of the geographical distance.

Each continent has a number of port countries from which migrants can travel to another continent. A fixed, large percentage (for example, 95% in some simulations) of the migrants through a port come from the country in which the port is located, with the remainder drawn from other countries in the continent in proportion to their inverse squared distance. The value next to a port in Fig. 2 is its migration rate, in people per generation, and the date in parentheses indicates when the port opens, if it is more recent than the start of the simulation in 20,000 BC. When a port opens, there is usually a single generation of migration at a higher rate than the steady-state rate shown in the figure. After the year AD 1500, additional large ports, which are not shown, begin to open to simulate colonization of the Americas, Australia and elsewhere. Immediately before this, the native population of the Americas is markedly reduced to simulate the effects of European-introduced diseases (ref. 7).

Generations overlap in this model and we explicitly simulated the lifespan and the times at which mating and reproduction events occur for each individual (refs 8, 9), as described in more detail in Supplementary Information. The birth rate of each continent or island was individually adjusted so that the populations match historical

**Figure 2** Geography and migration routes of the simulated model. Arrows denote ports and the adjacent numbers are their steady migration rates, in individuals per generation. If given, the date in parentheses indicates when the port opens. Upon opening, there is usually a first-wave migration burst at a higher rate, lasting one generation.
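The two distance-weighted rules described above (destination choice within a continent, and the origin of migrants passing through a port) might be sketched as follows. This is an illustration only: the function names, the use of flat `(x, y)` coordinates in place of geographical distance, and the example countries are all assumptions, and countries are assumed to have distinct coordinates. The 0.95 default mirrors the "95% in some simulations" example in the text.

```python
import random


def inverse_square_weights(origin, countries):
    """Unnormalised 1/distance^2 weights for every country except origin."""
    ox, oy = countries[origin]
    return {
        name: 1.0 / ((x - ox) ** 2 + (y - oy) ** 2)
        for name, (x, y) in countries.items()
        if name != origin
    }


def choose_destination(origin, countries, rng):
    """Pick a destination country with probability proportional to 1/d^2."""
    weights = inverse_square_weights(origin, countries)
    names = list(weights)
    return rng.choices(names, weights=[weights[c] for c in names], k=1)[0]


def port_source_probabilities(port, countries, local_share=0.95):
    """Probability that a migrant leaving through `port` comes from each country.

    A fixed share comes from the port country itself; the remainder is split
    among the other countries in proportion to inverse squared distance.
    """
    weights = inverse_square_weights(port, countries)
    total = sum(weights.values())
    probs = {name: (1.0 - local_share) * w / total for name, w in weights.items()}
    probs[port] = local_share
    return probs


# Hypothetical continent with three countries on a line.
countries = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0)}
rng = random.Random(0)
print(inverse_square_weights("A", countries))  # B is 4 times as likely as C
print(choose_destination("A", countries, rng))
print(port_source_probabilities("A", countries))
```

Keeping the weights unnormalised until the final draw means the same helper serves both rules; only the port rule needs explicit probabilities, because the local share is fixed in advance.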

