Genome Research 2015 - Karmin
Downloaded from genome.cshlp.org on March 16, 2015 - Published by Cold Spring Harbor Laboratory Press
Research
It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50–100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192–307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47–52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.

Genome Research 25:1–8. Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/15; www.genome.org
[Supplemental material is available for this article.]
Despite the higher per-base-mutation rate of mtDNA, the much greater length of the Y chromosome (Chr Y) offers the highest genealogical resolution of all non-recombining loci in the human genome. Previous studies have established a standard Y chromosome haplogroup nomenclature based on resequencing of limited tracts of the locus in small numbers of geographically diverse samples (The Y Chromosome Consortium 2002; Karafet et al. 2008; van Oven et al. 2013). As a result, the precise order and timing of the phylogenetic splits have only recently started to emerge from whole Y chromosome sequences (Francalacci et al. 2013; Mendez et al. 2013; Poznik et al. 2013; Wei et al. 2013; Lippold et al. 2014; Scozzari et al. 2014; Yan et al. 2014; Hallast et al. 2015). While the male to female effective population size ratio has been estimated as being below one throughout much of human evolutionary history (Lippold et al. 2014), the factors affecting its dynamics are still poorly understood. Here, we combine 299 new whole Y chromosome high-coverage sequences from 110 populations with similar publicly available data (Fig. 1; Supplemental Table S1; Methods). We use these 456 sequences to estimate the coalescent times and order of haplogroup splits (Supplemental Information 3,4), and we use simulations (Supplemental Information 5) to test the scenarios that can explain the observed patterns in the mtDNA and Y chromosome data for a subset of 320 individuals.

In labeling Y chromosome haplogroups, we follow the principles and rules set out by the Y Chromosome Consortium (YCC) (The Y Chromosome Consortium 2002). As we introduce a large number of new whole Chr Y sequences that substantially increase the resolution of the internal branches of the Chr Y tree, we try to both incorporate the new information and to maintain the integrity and historical coherence of the initial YCC haplogroup nomenclature as introduced in 2002 and its updates (Jobling and Tyler-Smith 2003; Karafet et al. 2008). We use an approach similar to the concise reference phylogeny proposed by van Oven et al. (2014) with minor modifications that are aimed at making the haplogroup nomenclature more amenable to the incorporation of novel haplotypes than it is now (Supplemental Information 6).
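[Figure 1 graphic: time-scaled phylogenetic tree with an age axis running from 250,000 to 0. Labeled clades: A00, A2, BT, B2’5, CT, DE, DT, C, D, HT, IT, NR, MR, P, H1’3, IJ, LT, MS, NO, C3, P1, I, J, Q, G2a, R, N, R1, R1b, R1b1’13, Q1a’c, R2, R1a1’3. Axis levels marked II, III, IV. Region legend entries: Africa, Near-East*, Europe, South-Asia, Central-Asia, Andes.]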
Figure 1. The phylogenetic tree of 456 whole Y chromosome sequences and a map of sampling locations. The phylogenetic tree is reconstructed using BEAST. Clades coalescing within 10% of the overall depth of the tree have been collapsed. Only main haplogroup labels are shown (details are provided in Supplemental Information 6). Colors indicate geographic origin of samples (Supplemental Table S1), and fill proportions of the collapsed clades represent the proportion of samples from a given region. The asterisk (*) marks the inclusion of samples from the Caucasus area. Personal Genomes Project (http://www.personalgenomes.org) samples of unknown and mixed geographic/ethnic origin are shown in black. The proposed structure of Y chromosome haplogroup naming (Supplemental Table S5) is given in Roman numerals on the y-axis.
Results

Using standard and custom filters (Supplemental Information 2; Supplemental Table S2), we first identified reliable regions on the Chr Y and retained 8.8 Mb of sequence per individual. A total of 35,700 SNPs had a call rate higher than 95% and were subsequently used in phylogenetic analyses and for estimation of coalescence times. Data quality assessment by evaluating SNP differences between father-son pairs resulted in an average of approximately one mutation per pair, indicating a low false-positive rate, and only 588 recurrent sites (1.6%) were observed in the filtered data. Combining independent evidence from two ancient DNA sequences, we estimated the mutation rate of Y chromosome binary SNPs in the filtered regions at 0.74 × 10^-9 (95% CI 0.63–0.95 × 10^-9) per base pair (bp) per yr (Supplemental Information 3). It should be noted that this estimate is based on only two ancient DNA samples from a relatively recent time horizon and the same Y chromosome haplogroup. However, a very similar mutation rate estimate of 0.76 × 10^-9 per bp per yr was determined independently from a different ancient DNA specimen of much older age by a recent study (Fu et al. 2014).

We uncovered new phylogenetic structure and reappraised haplogroup definitions and their branch lengths in the global phylogeny (Fig. 1; Supplemental Fig. S3). We also generated two Illumina high-coverage sequences of African haplogroup A00 (Mendez et al. 2013) to root the phylogeny and to determine the ancestral versus derived states of the variable sites (Supplemental Table S8). We estimated the age of the split between A00 and the rest at 254 thousand yr ago (kya) (95% CI 192–307 kya; Supplemental Table S7). Comparing chimpanzee and A00 outgroup information across the 652 positions separating haplogroups A2’5 and BT (Supplemental Fig. S13) revealed inconsistency at 4.6% of sites. The observed number of discordant calls was significantly higher than the 1%–2% discordance rate predicted from phylogenetic divergence between human and chimpanzee genomes (The Chimpanzee Sequencing and Analysis Consortium 2005) and likely reflects the uncertainties in mapping cross-species reads to the same reference sequence.

In anticipation of ever larger numbers of whole sequences, we simplified the Y chromosome haplogroup nomenclature (Supplemental Information 6) for all clades by using the “join” rule (The Y Chromosome Consortium 2002) and classified them relative to four coalescent horizons (Fig. 1; Supplemental Table S5). We used high-coverage whole-genome sequence data from this and previous studies to define the layout of the basic A and B subclades (Supplemental Figs. S14, S15). We found 236 markers that separate haplogroups restricted to African populations (A and B) from the rest of the phylogeny (Supplemental Fig. S13). Notably, we detected a >15-ky gap between the separation of African and non-African lineages at 68–72 (95% CI 52–87) kya and the short interval at 47–52 (95% CI 36–62) kya when non-African lineages differentiate into higher level haplogroups common in Eurasian, American, and Oceanian populations (Supplemental Table S7; Supplemental Fig. S9). This gap would be even more pronounced (52–121 kya) if extant Asian D and African E distributions could be explained by an early back-migration of ancestral DE lineages to Africa (Hammer et al. 1998).

In the non-African haplogroups C and F, we identified a number of novel features. We report that C now bifurcates into C3 (Supplemental Fig. S20) and another clade containing all the other C lineages including two new highly divergent subclades detected in our Island Southeast Asian samples that we call C7 and C9 (Supplemental Fig. S21). We show that only the F1329 SNP (Supplemental Fig. S13) first separates the deep F and GT branches and corroborate the succeeding swift split of G from HT by the single M578 SNP (Poznik et al. 2013). Similarly, all other subsequent inner branches (IT, K, NR, MR, P), common throughout non-African populations, are short and consistent with a rapid diversification of the basic Eurasian and Oceanian founder lineages at around 50 kya (Supplemental Fig. S9; Bowler et al. 2003; Higham et al. 2014). Within the Y chromosome haplogroups common in Eurasian populations, we noticed that many coalesce within the last 15 ky (Fig. 1), i.e., corresponding to climate improvement after the Last Glacial Maximum, and a cluster (Supplemental Table S7; Supplemental Fig. S11) of novel region-specific clades (Supplemental Information 6) with coalescence times within the last 4–8 ky. Regional representations of pairwise divergence times of Y chromosomes also revealed clustering of coalescence events consistent with the peopling of the Americas at around 15 kya (Supplemental Fig. S12).

We used Bayesian skyline plots (BSP) to infer temporal changes of regional male and female effective population sizes (Ne) (Supplemental Fig. S4A). The cumulative global BSP of 320 Y chromosomes with known geographic affiliation and the plot inferred from mtDNA sequences from the same individuals both showed increases in the Ne at ∼40–60 kya (Fig. 2). However, the two plots differed in a number of important features. Firstly, the Ne estimates based on mtDNA are consistently more than twice as high as those based on the Y chromosome (Supplemental Fig. S6). Secondly, both mtDNA and Y plots (Supplemental Fig. S4) showed an increase of Ne in the Holocene, which has been documented before for the female Ne (Gignoux et al. 2011). However, the Y chromosome plot suggested a reduction at around 8–4 kya (Supplemental Fig. S4B; Supplemental Table S4) when the female Ne is up to 17-fold higher than the male Ne (Supplemental Fig. S5).

Discussion

The estimated time line of the Y chromosome coalescent events in non-African populations (Supplemental Fig. S9) fits well with archaeological evidence for the dates of colonization of Eurasia and Australia by anatomically modern humans as a single wave ∼50 kya (Bowler et al. 2003; Mellars et al. 2013; Higham et al. 2014; Lippold et al. 2014). However, considering the fact that the Y chromosome is essentially a single genetic locus with an extremely low Ne, estimated at <100 at the time of the out-of-Africa dispersal (Lippold et al. 2014), these results cannot refute the alternative models suggesting earlier Middle Pleistocene dispersals (100–130 kya) from Africa along the southern route (Armitage et al. 2011; Reyes-Centeno et al. 2014). The evidence for these early dispersals could potentially be embedded only in the autosomal genome.

The surprisingly low estimates of the male Ne might be explained either by natural selection affecting the Y chromosome or by culturally driven sex-specific changes in variance in offspring number. As the drop of male to female Ne does not seem to be limited to a single or a few haplotypes (Supplemental Fig. S3), selection is not a likely explanation. However, the drop of the male Ne during the mid-Holocene corresponds to a change in the archaeological record characterized by the spread of Neolithic cultures, demographic changes, as well as shifts in social behavior (Barker 2006). The temporal sequence of the male Ne decline patterns among continental regions (Supplemental Fig. S4B) is consistent with the archaeological evidence for the earlier spread
of farming in the Near East, East Asia, and South Asia than in Europe (Fuller 2003; Bellwood 2005). A change in social structures that increased male variance in offspring number may explain the results, especially if male reproductive success was at least partially culturally inherited (Heyer et al. 2005).

Changes in population structure can also drastically affect the Ne. In simple models of population structure, with no competition among demes, structure will always increase the Ne. However, structure combined with an unbalanced sampling strategy can lead BSP to infer false signals of population decline under a constant population size model (Heller et al. 2013). An increase in male migration rate might reduce the male Ne but is unlikely to cause a brief drastic reduction in Ne as observed in our empirical data. Similarly, simple models of increased or decreased population structure are not sufficient to explain the observed patterns (Supplemental Information 5; Supplemental Fig. S7). However, in models with competition among demes, an increased level of variance in expected offspring number among demes can drastically decrease the Ne (Whitlock and Barton 1997). The effect may be male-specific, for example, if competition is through a male-driven conquest. A historical example might be the Mongol expansions (Zerjal et al. 2003). Innovations in transportation technology (e.g., the invention of the wheel, horse and camel domestication, and open water sailing) might have contributed to this pattern. Likely, the effect we observe is due to a combination of culturally driven increased male variance in offspring number within demes and an increased male-specific variance among demes, perhaps enhanced by increased sex-biased migration patterns (Destro-Bisol et al. 2004; Skoglund et al. 2014) and male-specific cultural inheritance of fitness.

[Figure 2 graphic: two panels titled "Y chr" and "MtDNA". x-axis: Thousands of Years Ago (ticks at 100, 50, 10, 0). y-axis: Effective Population Size (thousands). Region legend: Africa, Andes, Central Asia, Europe, Near-East & Caucasus, Southeast & East Asia, Siberia, South Asia.]

Figure 2. Cumulative Bayesian skyline plots of Y chromosome and mtDNA diversity by world regions. The red dashed lines highlight the horizons of 10 kya and 50 kya. Individual plots for each region are presented in Supplemental Figure S4A.

We note that any nonselective explanation for the reduction in Ne would also predict a reduction of the Ne at autosomal loci in this short time interval (Supplemental Fig. S6). In fact, when the sex difference in Ne is large, the autosomal effective population size should be dominated by the sex with the lowest effective population size. However, most existing methods are underpowered to detect Ne changes within the past few thousand years (i.e., relatively short-lived demographic events) from recombining genome-wide sequence data, resulting in limited evidence either for or against such patterns in autosomal data. A recent study using a newly developed approach reported variable growth patterns in the window of 2–10 ky among global populations and some evidence of a reduction in the Ne in local populations (Schiffels and Durbin 2014). Finally, the inferred mid-Holocene Ne dips may represent a genuine population collapse following the introduction of farming, as has been recently shown for Western Europe using summed radiocarbon date density through time (Shennan et al. 2013).

The male-specific effective population size changes reported here highlight the potential of whole Y chromosome sequencing to improve our understanding of the demographic history of populations. Further insights into the causes of such sex-specific patterns will benefit from population-scale Y chromosome data from ancient DNA studies and their interpretation in an interdisciplinary framework including also archaeological and paleoclimatic evidence and integrative spatially explicit simulations.

Methods

Samples and sequencing

Following informed consent donor permission and authorization by local ethics committees, saliva or blood samples were collected from 299 unrelated male individuals from 110 populations, of which 16 are released under the accession number PRJEB7258 (Fig. 1; Supplemental Table S1; Clemente et al. 2014). For quality checks, we used additional data from 10 Estonian first-degree relatives, 24 Dutch father-son pairs, and four duplicate samples. Sequencing of the whole genome was performed at Complete Genomics (Mountain View, California) at standard (>40×) coverage for blood- and high coverage (>80×) for saliva-based DNA samples; the Dutch father-son pairs were blood samples sequenced at >80× coverage. Y chromosome (Chr Y) data from X-degenerate nonrecombining regions were extracted using cgatools and analyzed in combination with publicly available data (Drmanac et al. 2010; Lachance et al. 2012) and the Personal Genomes Project.

Independently, for the purpose of rooting the Chr Y tree with the oldest known clade, we sequenced the whole genomes from the buccal swabs of two individuals from the Mbo population, with the prior knowledge of their haplogroup being A00. This information was based on STR profiles and SNP genotyping (Mendez et al. 2013). Sequencing was performed on the Illumina HiSeq 2000 machines at the Genomic Research Center, Gene by Gene, Houston, Texas, at a target coverage of 30×. We used BWA 0.5.9 (Li and Durbin 2009) to map the paired-end reads to the GRCh37 human reference sequence, removed PCR duplicates with the SAMtools 0.1.19 rmdup command (Li et al. 2009), and then called Chr Y genotypes with SAMtools mpileup and BCFtools (Li et al. 2009), resulting in average Chr Y coverage of 12.7× and 17.2× for the two individuals, respectively.

Filtering the sequence data

We filtered the variant sites by the quality scores provided by Complete Genomics and kept only high-quality biallelic SNPs. We developed several additional filters to improve the quality of the resulting data set. Altogether, we tested four filters (Supplemental Table S2): (1) >5× unique sequence coverage filter, where regions with <5× unique coverage on Chr Y were removed; (2) X chromosome normalized coverage filter, where we tracked the fluctuations of relative unique coverage (UC) normalized to that of the X chromosome (Chr X) to highlight the deviation of local sequence coverage from the expected mean; (3) regional exclusion mask, where we excluded all of Chr Y outside a 10.8-Mb sequence mostly overlapping with X-degenerate regions shown to yield reliable next generation sequencing (NGS) data; and (4) re-mapping filter, where we modeled poorly mapping regions on Chr Y and identified those that also map to sequence data derived from female individuals (Supplemental Table S2; Supplemental Information 2).

Y chromosome mutation rate and haplogroup age estimation

In order to minimize the effects of NGS differences and autosomal versus sex chromosome specifics on mutation rate calibration, and to avoid the need to make assumptions about the extent of genetic variation in relation to archaeological evidence, we calibrated the Chr Y mutation rate in our Complete Genomics (CG) data by using inferences of the coalescent times of two Chr Y haplogroups, Q1 and Q2b, from ancient DNA data. We used Chr Y data of the 12.6-ky-old Anzick (Q1b) and 4-ky-old Saqqaq (Q2b) specimens (Rasmussen et al. 2010, 2014). In both cases, we used only transversion polymorphisms and the approach described in Rasmussen et al. (2014). For the calculations of Chr Y haplogroup coalescent times and BSP analyses, we combined the two ancient DNA-based mutation rate estimates using weights proportional to the product of age and coverage of both ancient DNA samples, yielding the final estimate of 0.74 × 10^-9 (95% CI 0.63–0.95 × 10^-9) per bp per yr (Supplemental Information 3). The coalescent ages of Chr Y haplogroups were estimated using two methodologies: Bayesian inference applied to sequence data (Supplemental Information 4) and estimation from short tandem repeat (STR) data. The STR-based age estimates were drawn using the method developed by Zhivotovsky et al. (2004) and modified by Sengupta et al. (2006) (Supplemental Information 3).

Phylogenetic analyses

Summary statistics, such as nucleotide diversity, mean pairwise differences, and AMOVA, were computed in Arlequin v3.5.1.3 (Excoffier and Lischer 2010). We used the software package BEAST v1.8.0 (Drummond et al. 2012) to reconstruct phylogenetic trees and to estimate coalescent ages of haplogroups and sex-specific effective population sizes. The general time reversible (GTR) substitution model was selected by jModelTest (Darriba et al. 2012) as the best fit for the Chr Y data and the HKY + I + G for the mitochondrial genomes. In order to reduce the computational load, the Chr Y BEAST analysis only contained the variable positions. However, the BEAST input XML file was modified by adding a parameter under the “patterns” section that specifies the nucleotide composition at invariable sites. For the eight geographically explicit regions (Supplemental Table S1), we generated BSPs for both Chr Y and mtDNA data (Supplemental Fig. S4A). The BSPs for Chr Y and mtDNA were plotted together in R (R Core Team 2012) using the package ggplot2 (Wickham 2009). To test for significant deviations in diversification rates along the branches of the Y chromosome tree, we used SymmeTree 1.1 (Supplemental Information 4; Chan and Moore 2005).

Simulations

FastSimCoal2 simulations of 500,000 sites of Chr Y and 16,569 sites of mtDNA were performed using mutation rates specified in Supplemental Information 3 and starting population size 10,000. Coalescent times of all nodes in the resulting trees were estimated under a constant size and exponential growth models. The growth model assumed constant size until 400 generations, followed by exponential growth
(Keinan and Clark 2012). For each model, we plotted the histogram of coalescent times of all the nodes of the simulated trees by six different deme formation scenarios: 1: no deme structure; 2: formation of 10 demes 400 generations ago; 3–6: formation of 25, 50, 75, and 100 demes 400 generations ago, respectively (Supplemental Information 5).

Nomenclature

In labeling Chr Y haplogroups, we follow the principles and rules set out by The Y Chromosome Consortium (2002). We try to both incorporate our new information and to maintain the integrity and historical coherence of the initial YCC haplogroup nomenclature as introduced in 2002 and its updates (Jobling and Tyler-Smith 2003; Karafet et al. 2008). We use an approach similar to the concise reference phylogeny proposed by van Oven et al. (2014) with minor modifications that are aimed at making the Chr Y haplogroup nomenclature more amenable to the incorporation of novel haplotypes than it is now. We propose to simplify the Chr Y haplogroup nomenclature by defining a limited number of levels of alphanumeric depth to be used in the haplogroup names, using the apostrophe symbol (’) to denote the “joined” names of related haplogroups at depths greater than Level I (Supplemental Table S5; Supplemental Information 6).

Data access

The raw read data on the Y chromosomes extracted from the whole genome sequences from this study have been submitted to the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) under accession number PRJEB8108. The data are also available at the data repository of the Estonian Biocentre www.ebc.ee/free_data/chrY.

List of Affiliations

1. Estonian Biocentre, Tartu, 51010, Estonia
2. Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
3. Department of Botany, Institute of Ecology and Earth Sciences, University of Tartu, Tartu, 51010, Estonia
4. Division of Biological Anthropology, University of Cambridge, Cambridge, United Kingdom
5. Department of Integrative Biology, University of California Berkeley, Berkeley, California, USA
6. School of Life Sciences and The Biodesign Institute, Tempe, Arizona, USA
7. Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
8. Estonian Genome Center, University of Tartu, Tartu, 51010, Estonia
9. Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, 51010, Estonia
10. Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences, Yerevan, Armenia
11. Institute of Biochemistry and Genetics, Ufa Scientific Center of the Russian Academy of Sciences, Ufa, Russia
12. Department of Psychology, University of Auckland, Auckland, 1142, New Zealand
13. Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA
14. Department of Applied Social Sciences, University of Winchester, Winchester, United Kingdom
15. The Wellcome Trust Sanger Institute, Hinxton, United Kingdom
16. Center of Molecular Diagnosis and Genetic Research, University Hospital of Obstetrics and Gynecology, Tirana, Albania
17. Center for GeoGenetics, University of Copenhagen, Copenhagen, Denmark
18. Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
19. Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russia
20. Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
21. School of Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
22. Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
23. DNcode Laboratories, Moscow, Russia
24. Evolutionary Medicine Group, Laboratoire d’Anthropologie Moléculaire et Imagerie de Synthèse, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse, France
25. Eijkman Institute for Molecular Biology, Jakarta, Indonesia
26. Statistics and Bioinformatics Group, Institute of Fundamental Sciences, Massey University, Palmerston North, New Zealand
27. Centre for Advanced Research in Sciences (CARS), DNA Sequencing Research Laboratory, University of Dhaka, Dhaka, Bangladesh
28. Arctic Research Centre, Aarhus University, Aarhus, DK-8000, Denmark
29. Environmental Futures Research Institute, Griffith University, Nathan, Australia
30. Genos, DNA Laboratory, Zagreb, Croatia
31. University of Osijek, Medical School, Osijek, Croatia
32. Centogene AG, Rostock, Germany
33. Institute of Bioorganic Chemistry, Academy of Science, Tashkent, 100143, Uzbekistan
34. Institute of Cytology and Genetics, Novosibirsk, Russia
35. Institute of Molecular Biology and Medicine, Bishkek, Kyrgyzstan
36. Kuban State Medical University, Krasnodar, Russia
37. L. N. Gumilyov Eurasian National University, Astana, Kazakhstan
38. Center for Life Sciences, Nazarbayev University, Astana, Kazakhstan
39. Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, Yakutsk, Russia
40. Laboratory of Molecular Biology, Institute of Natural Sciences, M. K. Ammosov North-Eastern Federal University, Yakutsk, Russia
41. Mongolian Academy of Medical Sciences, Ulaanbaatar, Mongolia
42. National Cancer Centre Singapore, Singapore
43. Northern State Medical University, Arkhangelsk, Russia
44. Anthony Nolan, London, UK
45. Department of Anthropology, University College London, London, United Kingdom
46. RIPAS Hospital, Bandar Seri Begawan, Brunei
47. Scientific-Research Center of the Caucasian Ethnic Groups, St. Andrews Georgian University, Tbilisi, Georgia
48. St. Catherine Specialty Hospital, Zabok, Croatia
49. Eberly College of Science, Pennsylvania State University, University Park, Pennsylvania, USA
50. University of Split, Medical School, Split, Croatia
51. V. N. Karazin Kharkiv National University, Kharkiv, Ukraine
52. Department of Genetics and Bioengineering, Faculty of Engineering and Information Technologies, International Burch University, Sarajevo, Bosnia and Herzegovina
53. Institute of Genetics and Cytology, National Academy of Sciences, Minsk, Belarus
54. Department of Human Genetics, Radboud University Medical Center, Nijmegen, Netherlands
55. Research Centre for Medical Genetics, Russian Academy of Sciences, Moscow, Russia
56. Genetics Laboratory, Institute of Biological Problems of the North, Russian Academy of Sciences, Magadan, Russia
57. Department of Zoology, University of Cambridge, Cambridge, United Kingdom
58. Integrative Systems Biology Lab, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
59. Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
60. ARL Division of Biotechnology, University of Arizona, Tucson, Arizona, USA
61. Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York, USA
62. The Henry Stewart Group, London, United Kingdom
63. Vavilov Institute for General Genetics, Russian Academy of Sciences, Moscow, Russia
64. University Hospital of North Norway, Tromsøe, Norway
65. Research Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
66. Estonian Academy of Sciences, Tallinn, Estonia
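Illustrative sketch (mutation rate weighting). The Methods state that the two ancient DNA-based rate estimates were combined "using weights proportional to the product of age and coverage". The Python sketch below shows one way such a weighted mean could be computed. Only the specimen ages (12.6 ky and 4 ky) come from the text; the coverage values and per-sample rates are placeholders, and the names `Calibration` and `combine_rates` are hypothetical. The actual procedure is described in Supplemental Information 3 and may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Calibration:
    """One ancient-DNA calibration point (non-age values are illustrative)."""
    name: str
    age_ky: float    # specimen age in thousands of years
    coverage: float  # mean Chr Y sequencing depth (placeholder below)
    rate: float      # per-sample mutation rate estimate, per bp per yr


def combine_rates(points: Sequence[Calibration]) -> float:
    """Weighted mean of per-sample rates, with weight = age x coverage."""
    weights = [p.age_ky * p.coverage for p in points]
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p.rate for w, p in zip(weights, points)) / total


# Ages are taken from the text; coverage and rates are assumed placeholders.
points = [
    Calibration("Anzick", age_ky=12.6, coverage=1.0, rate=0.70e-9),
    Calibration("Saqqaq", age_ky=4.0, coverage=1.0, rate=0.80e-9),
]
combined = combine_rates(points)
print(f"combined rate: {combined:.3e} per bp per yr")
```

With this weighting, an older or more deeply sequenced specimen pulls the combined estimate toward its own rate, which is presumably the intent of weighting by age and coverage.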
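Illustrative sketch (from mutation rate to time). The Results report a mutation rate of 0.74 × 10^-9 per bp per yr (95% CI 0.63–0.95 × 10^-9) over 8.8 Mb of retained sequence. Under a simple strict-clock assumption, these two numbers give the expected waiting time per mutation on a lineage and a rough divergence time from a pairwise difference count. This is back-of-the-envelope arithmetic, not the Bayesian (BEAST) dating used in the study; the function names and the example difference count are hypothetical.

```python
MU = 0.74e-9                # point estimate, per bp per yr (from the text)
MU_CI = (0.63e-9, 0.95e-9)  # 95% CI reported with it
CALLABLE_BP = 8.8e6         # filtered Chr Y sequence retained per individual


def years_per_mutation(mu: float, length_bp: float) -> float:
    """Expected number of years between mutations along a single lineage."""
    return 1.0 / (mu * length_bp)


def split_time_years(pairwise_differences: float, mu: float, length_bp: float) -> float:
    """Strict-clock divergence time of two lineages: T = d / (2 * mu * L)."""
    return pairwise_differences / (2.0 * mu * length_bp)


point = years_per_mutation(MU, CALLABLE_BP)
# A higher rate means fewer years per mutation, so the CI bounds swap order.
fast = years_per_mutation(MU_CI[1], CALLABLE_BP)
slow = years_per_mutation(MU_CI[0], CALLABLE_BP)
print(f"years per mutation: {point:.0f} (range {fast:.0f} to {slow:.0f})")

# Hypothetical example: two Y chromosomes differing at 100 filtered sites.
print(f"split time for 100 differences: {split_time_years(100, MU, CALLABLE_BP):.0f} yr")
```

The factor of 2 in `split_time_years` reflects that differences accumulate on both lineages after they split.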
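Illustrative sketch (sex ratio and autosomal Ne). The Discussion notes that when the sex difference in Ne is large, the autosomal effective population size should be dominated by the sex with the lowest effective population size. The textbook relation for unequal numbers of breeding males and females, Ne = 4 Nm Nf / (Nm + Nf), makes this concrete. The sketch below treats the sex-specific values as simple breeding numbers, which is a simplification; the male value of 1,000 is a placeholder, and the 17-fold ratio is the upper bound quoted in the Results.

```python
def autosomal_ne(n_male: float, n_female: float) -> float:
    """Autosomal Ne for unequal breeding sexes: 4 * Nm * Nf / (Nm + Nf)."""
    if n_male <= 0 or n_female <= 0:
        raise ValueError("both sex-specific sizes must be positive")
    return 4.0 * n_male * n_female / (n_male + n_female)


n_male = 1_000.0  # placeholder value, not from the text
for ratio in (1, 2, 5, 17):
    n_female = ratio * n_male
    ne = autosomal_ne(n_male, n_female)
    # Ne can never exceed 4 * Nm, however many females there are.
    print(f"female/male = {ratio:>2}: autosomal Ne = {ne:,.0f} "
          f"({ne / (4 * n_male):.0%} of the 4*Nm ceiling)")
```

Because the result is capped at four times the size of the rarer sex, a brief collapse in male Ne would be expected to pull autosomal Ne down as well, which is the prediction the authors say current methods are underpowered to test.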
Rasmussen M, Li Y, Lindgreen S, Pedersen JS, Albrechtsen A, Moltke I, Metspalu M, Metspalu E, Kivisild T, Gupta R, et al. 2010. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463: 757–762.
Rasmussen M, Anzick SL, Waters MR, Skoglund P, DeGiorgio M, Stafford TW Jr, Rasmussen S, Moltke I, Albrechtsen A, Doyle SM, et al. 2014. The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature 506: 225–229.
R Core Team. 2012. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/.
Reyes-Centeno H, Ghirotto S, Detroit F, Grimaud-Herve D, Barbujani G, Harvati K. 2014. Genomic and cranial phenotype data support multiple modern human dispersals from Africa and a southern route into Asia. Proc Natl Acad Sci 111: 7248–7253.
Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet 46: 919–925.
Scozzari R, Massaia A, Trombetta B, Bellusci G, Myres NM, Novelletto A, Cruciani F. 2014. An unbiased resource of novel SNP markers provides a new chronology for the human Y chromosome and reveals a deep phylogenetic structure in Africa. Genome Res 24: 535–544.
Sengupta S, Zhivotovsky LA, King R, Mehdi SQ, Edmonds CA, Chow CE, Lin AA, Mitra M, Sil SK, Ramesh A, et al. 2006. Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am J Hum Genet 78: 202–221.
Shennan S, Downey SS, Timpson A, Edinborough K, Colledge S, Kerig T, Manning K, Thomas MG. 2013. Regional population collapse followed initial agriculture booms in mid-Holocene Europe. Nat Commun 4: 2486.
Skoglund P, Malmstrom H, Omrak A, Raghavan M, Valdiosera C, Gunther T, Hall P, Tambets K, Parik J, Sjogren KG, et al. 2014. Genomic diversity and admixture differs for Stone-Age Scandinavian foragers and farmers. Science 344: 747–750.
van Oven M, Toscani K, van den Tempel N, Ralf A, Kayser M. 2013. Multiplex genotyping assays for fine-resolution subtyping of the major human Y-chromosome haplogroups E, G, I, J, and R in anthropological, genealogical, and forensic investigations. Electrophoresis 34: 3029–3038.
van Oven M, Van Geystelen A, Kayser M, Decorte R, Larmuseau MH. 2014. Seeing the wood for the trees: a minimal reference phylogeny for the human Y chromosome. Hum Mutat 35: 187–191.
Wei W, Ayub Q, Chen Y, McCarthy S, Hou Y, Carbone I, Xue Y, Tyler-Smith C. 2013. A calibrated human Y-chromosomal phylogeny based on resequencing. Genome Res 23: 388–395.
Whitlock MC, Barton NH. 1997. The effective size of a subdivided population. Genetics 146: 427–441.
Wickham H. 2009. Ggplot2: elegant graphics for data analysis. Springer, New York.
The Y Chromosome Consortium. 2002. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res 12: 339–348.
Yan S, Wang CC, Zheng HX, Wang W, Qin ZD, Wei LH, Wang Y, Pan XD, Fu WQ, He YG, et al. 2014. Y chromosomes of 40% Chinese descend from three Neolithic super-grandfathers. PLoS One 9: e105691.
Zerjal T, Xue Y, Bertorelle G, Wells RS, Bao W, Zhu S, Qamar R, Ayub Q, Mohyuddin A, Fu S, et al. 2003. The genetic legacy of the Mongols. Am J Hum Genet 72: 717–721.
Zhivotovsky LA, Underhill PA, Cinnioglu C, Kayser M, Morar B, Kivisild T, Scozzari R, Cruciani F, Destro-Bisol G, Spedini G, et al. 2004. The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am J Hum Genet 74: 50–61.

Received November 6, 2014; accepted in revised form February 13, 2015.
Supplemental Material: http://genome.cshlp.org/content/suppl/2015/02/18/gr.186684.114.DC1.html
Published online March 13, 2015 in advance of the print journal.
Creative Commons License: This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.