Tuesday, February 23, 2010

Phylogenetic tree of human Y-chromosome haplogroups observed in Eurasians

 
A couple of things I notice on the chart:
  1. R2 is shown as younger than R1/R1a/R1b, which I think is probably true.
  2. An Asian source for all Europeans - excluding haplogroup E.
link

    The Human Genetic History of South Asia

    Partha P. Majumder

    Summary:

    South Asia, comprising India, Pakistan, countries in the sub-Himalayan region and Myanmar, was one of the first geographical regions to have been peopled by modern humans. This region has served as a major route of dispersal to other geographical regions, including southeast Asia. The Indian society comprises tribal, ranked caste, and other populations that are largely endogamous. As a result of evolutionary antiquity and endogamy, populations of India show high genetic differentiation and extensive structuring. Linguistic differences of populations provide the best explanation of genetic differences observed in this region of the world. Within India, consistent with social history, extant populations inhabiting northern regions show closer affinities with Indo-European speaking populations of central Asia than those inhabiting southern regions. Extant southern Indian populations may have been derived from early colonizers arriving from Africa along the southern exit route. The higher-ranked caste populations, who were the torch-bearers of Hindu rituals, show closer affinities with central Asian, Indo-European speaking, populations.

    Current Biology, Volume 20, Issue 4, R184-R187, 23 February 2010
    Full article (link)

    Friday, February 19, 2010

    Jews of Haplogroup R2

    Haplogroup R2 is rather rare outside India, which accounts for about 90% of all men on Earth carrying R2. In India it has been observed in about 10% of the male population, and in Pakistan in about 7-8%. In Tajikistan, neighboring India, haplogroup R2 is found in about 6% of the population. A small percentage of the population carries R2 in the Caucasus, among Azerbaijanis, Armenians, Georgians and Chechens. It is conjectured that haplogroup R2 was introduced to these areas by the Gypsies, who carry haplogroup R2 at a frequency of more than 50% of their population. The next main haplogroup among the Gypsies is H, as described in the preceding paper in this issue with the example of Bulgarian Gypsies. It is surmised that haplogroup R2 originated some 25 thousand years ago.

    The Gypsies brought haplogroup R2 to Europe in medieval times, some 500-700 years ago, apparently first to Bulgaria, Germany and Austria (under the Gypsy names of Sinti and Roma), and it then spread over Europe. This haplogroup was recently found among the Jews, and scholars immediately suggested that it came from the Khazars. No justification and no time estimates were given.

    Recently (Sengupta et al, 2006) a large set of Indian and Pakistani haplotypes was published, including more than 900 haplotypes. 81 of them belonged to haplogroup R2. Since as many as 21 of those 81 share an identical six-marker haplotype (the base haplotype), as follows

    14-12-23-10-10-14

    it is rather obvious that these haplotypes cannot be too old. Indeed, ln (81/21)/0.0096 = 141 generations (163 with correction for back mutations) to a common ancestor. All 81 haplotypes contain 108 mutations from the above base haplotype, which gives 108/81/0.0096 = 139 generations (161 with correction for back mutations). The fit is practically exact, indicating that a single ancestor founded the R2 lineage in India about 4,000 years BP.
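    The two uncorrected figures quoted above can be reproduced with a short sketch. The function names are hypothetical, the rate 0.0096 (mutations per 6-marker haplotype per generation) is taken from the text, and the correction for back mutations is deliberately left out because the excerpt does not state how it is computed.

```python
import math

# Mutation rate per 6-marker haplotype per generation, as quoted in the text.
RATE_6_MARKER = 0.0096


def generations_from_base_fraction(n_total, n_base, rate):
    """Logarithmic estimate: ln(total / unmutated base haplotypes) / rate."""
    return math.log(n_total / n_base) / rate


def generations_from_mutations(n_mutations, n_haplotypes, rate):
    """Linear estimate: average mutations per haplotype / rate."""
    return n_mutations / n_haplotypes / rate


# Indian and Pakistani R2 set: 81 haplotypes, 21 base haplotypes, 108 mutations.
log_estimate = generations_from_base_fraction(81, 21, RATE_6_MARKER)
linear_estimate = generations_from_mutations(108, 81, RATE_6_MARKER)

# Neither value includes the back-mutation correction.
print(round(log_estimate), round(linear_estimate))
```

    The closeness of the two numbers is what the text treats as evidence of a single common ancestor for the set.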

    However, it seems that the R2 haplogroup actually originated much earlier. The R2 section of the YSearch database contains 34 haplotypes of individuals. Half of them are ethnic Indians, plus some Scots, French, Italians and Armenians. Twelve individuals list ancestors with names such as Abraham, Isaac, Lebe and Mordecai, and some of them provided supplementary information indicating that they are Ashkenazi Jews. The most frequent 6-marker haplotype among those 34 individuals is

    14-12-23-10-10-14

    which is exactly the same as the base haplotype of haplogroup R2 in India-Pakistan, shown above. However, if we remove the Jewish haplotypes (which, as shown below, are derived from a recent ancestor), the remaining 22 haplotypes contain 35 mutations, which translates into 198 generations from a common ancestor. In 22 of the 12-marker haplotypes there were 101 mutations, which gives 236 generations from a common ancestor. In 7 of the 37-marker haplotypes, the number of mutations in the 12-, 25- and 37-marker panels corresponded to 282, 259 and 207 generations to a common ancestor. These four figures, averaged, give 246±32 generations, that is, about 6,200±800 years to a common ancestor of the non-Jewish individuals of the R2 haplogroup in the YSearch database. This might be a good indication that haplogroup R2 did not originate in India, since the Indian R2 haplotypes were derived from a significantly “younger” ancestor who lived about 4,000 years BP (see above), some 2,000 years later than an older bearer of the R2 haplogroup. In any case, this question needs more detailed study.
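    As a check on the averaging step, here is a minimal sketch. It assumes that the "four figures" are the 12-marker estimate for the 22 haplotypes (236) and the three panel estimates for the 7 extended haplotypes (282, 259, 207), since those four reproduce 246±32 while the 6-marker figure of 198 does not enter. The 25 years per generation is also an assumption, inferred from 246 generations being given as about 6,200 years.

```python
from statistics import mean, stdev

# Assumption: 25 years per generation, inferred from the figures in the text.
YEARS_PER_GENERATION = 25

# Assumption: these are the four estimates (in generations) that were averaged.
estimates = [236, 282, 259, 207]

avg_generations = mean(estimates)
spread_generations = stdev(estimates)  # sample standard deviation

avg_years = avg_generations * YEARS_PER_GENERATION
spread_years = spread_generations * YEARS_PER_GENERATION

print(f"{avg_generations:.0f} ± {spread_generations:.0f} generations")
print(f"{avg_years:.0f} ± {spread_years:.0f} years")
```

    The result, 6,150 ± 802 years, matches the quoted 6,200±800 once rounded to the nearest hundred.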

    And when did the common ancestor of the Jews of haplogroup R2 live?

    The most frequent 12-marker haplotype among those 34 individuals, Jewish and not, is 

    14-23-14-10-13-20-12-12-11-14-10-29

    which is exactly the same as that for the Jewish individuals of haplogroup R2 in the YSearch database.

    Let us now consider the Jewish haplotypes in more detail.  

    6-marker haplotypes  

    11 of the 12 Jewish R2 haplotypes are identical to each other (Fig. 50), and their 6-marker base (ancestral) haplotype is

    14-12-23-10-10-14

    that is, the same as the most common haplotype among known bearers of the R2 haplogroup in India and elsewhere in the world.

    Figure 50. The 6-marker haplotype tree for 12 Jewish haplotypes of haplogroup R2. A “commercial” set (YSearch database)

    Formally, 11 base haplotypes out of 12 give ln(12/11)/0.0096 = 9 generations, and one mutation in all twelve 6-marker haplotypes gives 1/12/0.0096 = 9 generations to a common ancestor. The agreement of these figures points to a single ancestor for all 12 individuals in the set of their 6-marker haplotypes. However, as has happened before, this tentative conclusion should be examined with more extended haplotypes. Too often, 6-marker haplotypes, particularly in small haplotype sets, do not reveal mutations that occur in more extended panels of the haplotype.

    12-marker haplotypes 

    Indeed, moving to the 12-marker tree (Fig. 51) immediately shows that there are two groups of Jewish haplotypes, an “older” and a “younger” one, descending from the same ancestor. Half of the 12 haplotypes still represent the base (ancestral) haplotype

    14-23-14-10-13-20-12-12-11-14-10-29

    Since their other mutations will be revealed by moving to more extended haplotypes, an estimate of the time span to the common ancestor based on the 12-marker haplotypes will be only tentative.

    Figure 51. The 12-marker haplotype tree for 12 Jewish haplotypes of haplogroup R2. A “commercial” set (YSearch database)

    This 12-marker base haplotype is exactly the same as the most frequent 12-marker haplotype in the YSearch database, only one-third of which are Jewish haplotypes.

    6 base haplotypes out of the 12 Jewish haplotypes would point to 29 generations to a common ancestor, since ln(12/6)/0.024 = 29. The other 6 haplotypes contain 15 mutations with respect to the above base haplotype. This would lead to 15/12/0.024 = 52 generations to a common ancestor. This mismatch (29 and 52) indicates that there was more than one common ancestor for the Jews in the R2 haplogroup. In fact, Fig. 51 makes it rather obvious.
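    The consistency test used here, comparing the logarithmic and the linear estimate for the same set, can be sketched as follows. The rates 0.0096 and 0.024 come from the text; the helper name `estimates_agree` and its 15% tolerance are hypothetical choices made only for illustration, since the text judges agreement by eye.

```python
import math


def log_and_linear_estimates(n_total, n_base, n_mutations, rate):
    """Return (logarithmic, linear) estimates in generations for one set."""
    log_estimate = math.log(n_total / n_base) / rate
    linear_estimate = n_mutations / n_total / rate
    return log_estimate, linear_estimate


def estimates_agree(a, b, tolerance=0.15):
    """Hypothetical criterion: relative difference within the tolerance."""
    return abs(a - b) / max(a, b) <= tolerance


# 6-marker Jewish set: 12 haplotypes, 11 base, 1 mutation, rate 0.0096.
six_marker = log_and_linear_estimates(12, 11, 1, 0.0096)
# 12-marker Jewish set: 12 haplotypes, 6 base, 15 mutations, rate 0.024.
twelve_marker = log_and_linear_estimates(12, 6, 15, 0.024)

print([round(x) for x in six_marker], estimates_agree(*six_marker))
print([round(x) for x in twelve_marker], estimates_agree(*twelve_marker))
```

    Under these assumptions the 6-marker pair (9 and 9) passes and the 12-marker pair (29 and 52) fails, which is the mismatch the text reads as a sign of more than one founder.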

    As it turned out, many more mutations have occurred in the 13-37 marker panel of the distant branch, which distinctly separates the two branches. However, even the 12-marker tree shows the principal separation of the two groups of haplotypes.

    37-marker haplotypes 

    The 37-marker haplotype tree is shown in Fig. 52. It reveals a striking feature of R2 Jewish haplotypes. Though there are only 7 haplotypes on the tree, they clearly show that Jewish R2 haplotypes do indeed split into two quite distant groups. The statistics are insufficient for a detailed analysis; however, there are still enough data to draw some principal conclusions. One group of haplotypes, located on the right-hand side of the tree (Fig. 52) and much closer to the trunk of the tree (that is, to the present time), has the same 12-marker base haplotype as shown immediately above and in the Table, and corresponds to the group of base haplotypes around the 12-marker tree (Fig. 51). This 4-haplotype branch has only 9 mutations with respect to the base 37-marker haplotype, and points to a common ancestor who lived only 26 generations BP, 650±50 years ago, in the 14th century.

    Figure 52. The 37-marker haplotype tree for 7 Jewish haplotypes of haplogroup R2. A “commercial” set (YSearch database)

    Another group of Jewish R2 haplotypes, represented with three distant haplotypes on the left-hand side in Fig. 52, shows a base 12-marker haplotype (the 37-marker haplotype is shown in the Table):

    14-23-14-10-13-20-12-12-10-13-10-31

    It turned out that these three haplotypes (the left-hand side in Fig. 52) have only two mutations in their 37-marker haplotypes, that is among 111 alleles. This formally places their common ancestor only 7 generations BP, that is about two hundred years ago. All three are relatives within seven generations.

    Overall, there are 21 mutations between these two base (ancestral) haplotypes in the 37-marker format. This means that the two haplotypes are separated by thousands of years of independent mutations; more specifically, the separation is approximately equivalent to 305 generations between the two, that is, about 7,600 years. This places their common ancestor about 4,200 years BP, which fits pretty well with the common ancestor of the Indian R2 haplogroup at 4,000 years BP (see above). It is very likely that both lineages, the “young” one and the ancient one, are derived from the Gypsies in Europe. The “young” one is traced to the time of its entry into the Jewish community or to a bottleneck there, and the “older” one is traced to the ancient common ancestor in India. At any rate, both Jewish ancestral haplotypes, shown above in their 12-marker format and in Table 2 in the 37-marker format, are derived from two quite unrelated individuals, whose haplotypes had evolved from the very first survivors in haplogroup R2 but trace back over millennia apparently to India, through the Gypsies.

    Some historical conjectures

    Here is a plausible story of the Jewish haplotypes of R2 haplogroup. Its ancestral haplotype  

    14-23-14-10-13-20-12-12-11-14-10-29

    shown here in the 12-marker format, is about 4,200 years old, which corresponds to the age of this haplotype in India (see above). This haplotype apparently arrived in Europe with the Gypsies in medieval times, some 800 years BP, and entered the Jewish community. About 30-40% of present-day Jews who bear the R2 haplogroup are direct descendants of those Gypsies, or of the Indians, for that matter. Approximately 650 years ago, apparently during the time of the Black Plague in the 14th century, a bearer of this haplogroup, albeit in a mutated form, survived and fled to Eastern Europe. This was a bottleneck for this particular haplotype. Close to half of present-day Jews carrying R2 are descendants of that individual.

    This story mirrors that of the Jewish Q haplotypes (see the preceding section). Apparently, the 14th century, the time of the Black Plague, created bottleneck situations for the Jews of a number of haplotypes, and not for the Jews only.
     
    The second Jewish R2 haplotype

    14-23-14-10-13-20-12-12-10-13-10-31

    in the 12-marker format, got to the Jewish community quite recently, merely two hundred years ago. It is very different from the first one. Its three bearers lived in the 19th century in Hungary, Romania and Lithuania. Their current descendants probably do not know that they are rather close relatives. Two of them differ by only 3 mutations in their 66-marker haplotypes.


    From
    Origin of the Jews via DNA Genealogy
    Anatole A. Klyosov

    Sunday, February 14, 2010

    Haplogroup R2 - Highlights from various studies

    • The most frequent haplogroups among the Indian upper castes belonged to R subclades (R*, R1 and R2) and that among the lower castes and tribal populations to haplogroup H and the distribution pattern of the major Y-lineages was observed to be similar in tribal and lower caste populations, and distinct from the upper castes thereby suggesting a tribal origin for the Indian lower castes, unaffected by geography.
    • The presence of west/central Asian lineages (J2, R1 and R2) and its higher STR diversity in most of the tribes suggested its presence in India much before the arrival of Indo-European pastoralists.
    • With regard to the caste populations, a South Asian origin for the Indian caste communities with minimal Central Asian influence was proposed based on the absence of certain haplogroups in Indian samples (C3, DE, J*, I, G, N and O) which covers almost half of the Central Asian Y-chromosomes and the presence of some haplogroups in Indian Y-chromosomes (C*, F*, H, L and R2) that is poor in Central Asia.
    • The claims for the association of haplogroups J2, L, R1a and R2 with the origin of majority of the caste’s paternal lineage from outside India, was rejected.
    • Indians showed the presence of diverse lineages of three major Eurasian Y-chromosomal haplogroups, C, F and K and the exclusive presence of several subclusters of F and K (H, L, R2 and F*) in the Indian subcontinent (especially H, L and R2) was consistent with the scenario that the southern route migration from Africa carried the ancestral Eurasian lineages to the Indian subcontinent.
    • Our analysis revealed that haplogroup R2 characterizes 13.5% of the Indian Y-chromosomes and its frequency among Dravidian speakers was comparable to that of haplogroup H (20.9%) and significantly different from Indo-European and Austro-Asiatic speakers (χ2= 16.2, d=3, p<0.05). While the distribution across various geographic regions was almost uniform, significant differentiation was observed along the social groups (χ2= 18.7, d=3, p<0.05); a decreasing gradient was discernible as one moved up the caste hierarchy. Although tribes contributed only 7.4% of the total R2 lineage, it was proportionately distributed between the Austro-Asiatic and Dravidian tribes. Extensive analysis of its distribution between north and south Indian populations showed that while there was marginal difference among middle and lower caste groups of north India (17.1 and 17.4% respectively), a clear gradient was observed among south Indians, where the frequency declined by more than one-half from lower to upper caste groups. Analysis of 20-Y-STRs within the R2 lineage revealed that three haplotypes were shared; one between Kamma Chaudhary and Kappu Naidu, both lower caste Dravidian speakers from Andhra Pradesh and two within Karmali and Pallar populations.
    • Four haplogroups; H= 23%; R1a1=17.5%; O2a=15% and R2=13.5%, form major paternal lineage of Indians and together account for ~70% of their Y-chromosomes.
    • The observed high frequency of R2 Y-chromosomes in Indians, which is equivalent to that of haplogroup H among Dravidian speakers, corroborates previous reports suggesting its Indian origin (Cordaux et al., 2004). The deep coalescence time for R2 lineages, dating back to Late Pleistocene, supports its indigenous origin. Outside India, it is found in Iran and Central Asia (3.3%) and among Roma Gypsies of Europe, known to have historical evidence of their migration from India (Wells et al., 2001). Within India, while it is predominant in both eastern and southern regions, its distribution pattern is rather patchy in east (Sahoo et al., 2006). It is most likely that genetic drift or bottleneck has reduced the paternal diversity of Karmali, which contributes 28% of the eastern R2 lineages. This population although considered to be Austro-Asiatic speaker, does not present any evidence of O2a Y-chromosome lineage, portraying a distinctly different history.
    • Further, deeper coalescence age for the Y-chromosome haplogroups C, H, R2 compared to O2a is consistent with hypothesis that Austro-Asiatic speakers cannot be considered as the earliest settlers of South Asia.
    • Based on deep coalescence age estimates of H, R2 and C Y-chromosome lineages, their diversity and distribution pattern, our data suggests an early Pleistocene settlement of South Asia by Dravidian speaking south Indian populations; the Austro-Asiatic speakers migrated much later from SE Asia and probably contributed only paternal lineages while amalgamating with the aboriginal populations of the region.
    High Resolution Phylogeographic Map of Y-Chromosomes Reveal the Genetic Signatures of Pleistocene Origin of Indian Populations
    R. Trivedi, Sanghamitra Sahoo, Anamika Singh, G. Hima Bindu, Jheelam Banerjee, Manuj Tandon, Sonali Gaikwad, Revathi Rajkumar, T Sitalaximi, Richa Ashma, G. B. N. Chainy and V. K. Kashyap
    • The high frequency and STR diversity of haplogroup R2 in Indians corroborates its Indian origin.
    • It has also been reported in Iran and Central Asia with marginal frequency, which more likely suggests a recent migration from India. It is present at high frequency (53%) among Gypsies of Uzbekistan, known to have historically migrated out from India. Interestingly, this haplogroup is absent or infrequent among Gypsies of Europe whose predominant Y chromosome haplogroup is H.
    • “The proposition that a high frequency of R1a in India is caused by admixture with populations of Central Asian origin is difficult to substantiate, as the proposed source region does not meet the expectation of containing high frequencies of the other components of haplogroup R, with no examples of R* and generally low incidence of R2, which, unlike J2, does not show evidence of a recent diffusion throughout India from the northwest. The distribution of R2, with its concentration in Eastern and Southern India, is not consistent with a recent demographic movement from the northwest. Instead, its prevalence among castes in these regions might represent a recent population expansion, perhaps associated with the transition to agriculture, which may have occurred independently in South Asia.”
    • Departing from the “one haplogroup equals one migration” scenario, Cordaux et al. defined, heuristically, a package of haplogroups (J2, R1a, R2, and L) to be associated with the migration of IE people and the introduction of the caste system to India, again from Central Asia, because they had been observed at significantly lower proportions in South Indian tribal groups, with the high frequency of R1a among Chenchus of Andhra Pradesh considered as an aberrant phenomenon. Conversely, haplogroups H, F*, and O2a, which were observed at significantly higher proportions among tribal groups of South India, led the same authors to single them out as having an indigenous Indian origin.
    Gyaneshwer Chaubey et al., “Peopling of South Asia: investigating the caste-tribe continuum in India,” BioEssays 29, no. 1 (2007): 91-100. 
    • Similarly, the proposition that a high frequency of R1a in India is caused by admixture with populations of Central Asian origin is difficult to substantiate, as the proposed source region does not meet the expectation of containing high frequencies of the other components of haplogroup R, with no examples of R* and generally low incidence of R2, which, unlike J2, does not show evidence of a recent diffusion throughout India from the northwest.
    • Second, it is notable that the results from the ADMIX2 program gave relatively high reciprocal admixture (0.3–0.35) proportions for Northwest Indian and Central Asian populations, despite the incompatibility of the respective haplogroup frequency pools; our Northwest Indian sample totally lacks haplogroups C3, DE, J*, I, G, N, and O, which cover almost half of the Central Asian Y chromosomes, whereas the Central Asian sample is poor in haplogroups C*, F*, H, L, and R2 (with a combined frequency of 10%). Hence, the admixture proportions are driven solely by the shared high frequency of R1a. In other words, if the source of R1a variation in India comes from Central Asia, as claimed by Wells et al. and Cordaux et al., then, under a recent gene flow scenario, one would expect to find the other Central Asian-derived NRY haplogroups (C3, DE, J*, I, G, N, O) in Northwest India at similarly elevated frequencies, but that is not the case.
    • Alternatively, although the simple admixture scenario does not hold, one could nevertheless argue that the other haplogroups were lost during a hypothetical bottleneck (lineage sorting among the early Indo-Aryans arriving to India). But in line with this scenario, one should expect to observe dramatically lower genetic variation among Indian R1a lineages. In fact, the opposite is true: the STR haplotype diversity on the background of R1a in Central Asia (and also in Eastern Europe) has already been shown to be lower than that in India. Rather, the high incidence of R1* and R1a throughout Central Asian and East European populations (without R2 and R* in most cases) is more parsimoniously explained by gene flow in the opposite direction, possibly with an early founder effect in South or West Asia.
    • Rather, taken together with the evidence from Fst values, the elements discussed so far (i.e., admixture, factor analysis, and frequency distributions) are more parsimoniously explained by a predominantly pre-IE, pre-Neolithic presence in India, for the majority of those Y lineages considered here (R1a, R2, L1), which occur together with strictly Indian-specific haplogroups and paragroups (C*, F*, H) among both caste and tribal groups. The distribution of R2, with its concentration in Eastern and Southern India, is not consistent with a recent demographic movement from the northwest. Instead, its prevalence among castes in these regions might represent a recent population expansion, perhaps associated with the transition to agriculture, which may have occurred independently in South Asia (23). A pre-Neolithic chronology for the origins of Indian Y chromosomes is also supported by the lack of a clear delineation between DR and IE speakers. Again, although appeals to language change are plausible for explaining the appearance of supposedly tribe-specific Y lineages among incoming IE speakers, it is much harder to conceive of a systematic movement of external Y-chromosome types in the opposite direction, via the uptake of DR languages. The near absence of L lineages within the IE speakers from Bihar (0%), Orissa (0%), and West Bengal (1.5%) further suggests that the current distribution of Y haplogroups in India is associated primarily with geographic rather than linguistic or cultural determinants.
    • It is not necessary, based on the current evidence, to look beyond South Asia for the origins of the paternal heritage of the majority of Indians at the time of the onset of settled agriculture. The perennial concept of people, language, and agriculture arriving to India together through the northwest corridor does not hold up to close scrutiny. Recent claims for a linkage of haplogroups J2, L, R1a, and R2 with a contemporaneous origin for the majority of the Indian castes’ paternal lineages from outside the subcontinent are rejected, although our findings do support a local origin of haplogroups F* and H. Of the others, only J2 indicates an unambiguous recent external contribution, from West Asia rather than Central Asia. The current distributions of haplogroup frequencies are, with the exception of the O lineages, predominantly driven by geographical, rather than cultural determinants. Ironically, it is in the northeast of India, among the TB groups that there is clear-cut evidence for large-scale demic diffusion traceable by genes, culture, and language, but apparently not by agriculture.
    A Prehistory of Indian Y-Chromosomes: Evaluating Demic Diffusion Scenarios
    Sanghamitra Sahoo et al, 2006, The National Academy of Sciences of the USA
    • H, L, and R2 are the major Indian Y-chromosomal haplogroups that occur both in castes and in tribal populations and are rarely found outside the subcontinent. Haplogroup R1a, previously associated with the putative Indo-Aryan invasion, was found at its highest frequency in Punjab but also at a relatively high frequency (26%) in the Chenchu tribe.
    • Altogether, three clades (H, L, and R2) account for more than one-third of Indian Y chromosomes. They are also found in decreasing frequencies in central Asians to the north and in Middle Eastern populations to the west. Unclassified derivatives of the general Eurasian clade F were observed most frequently (27%) in the Koyas.
    • The presence of several subclusters of F and K (H, L, R2, and F*) that are largely restricted to the Indian subcontinent is consistent with the scenario that the coastal (southern route) migration(s) from Africa carried the ancestral Eurasian lineages first to the coast of Indian subcontinent (or that some of them originated there). Next, the reduction of this general package of three mtDNA (M, N, and R) and four Y-chromosomal (C, D, F, and K) founders to two mtDNA (N and R) and two Y-chromosomal (F and K) founders occurred during the westward migration to western Asia and Europe. After this initial settlement process, each continental region (including the Indian subcontinent) developed its region-specific branches of these founders, some of which (e.g., the western Asian HV and TJ lineages) have, via continuous or episodic low-level gene flow, reached back to India. Western Asia and Europe have thereafter received an additional wave of genes from Africa, likely via the Levantine corridor, bringing forth lineages of Y-chromosomal haplogroup E, for example (Underhill et al. 2001b), which is absent in India.
    • Given the geographic spread and STR diversities of sister clades R1 and R2, the latter of which is restricted to India, Pakistan, Iran, and southern central Asia, it is possible that southern and western Asia were the source for R1 and R1a differentiation. Compared with western Asian populations, Indians show lower STR diversities at the haplogroup J background (Quintana-Murci et al. 2001; Nebel et al. 2002) and virtually lack J*, which seems to have higher frequencies in the Middle East and East Africa (Eu10 [Nebel et al. 2001]; Ht25 [Semino et al. 2002]) and is common also in Europe (Underhill et al. 2001b). Therefore, J2 could have been introduced to northwestern India from a western Asian source relatively recently and, subsequently, after co-mingling in Punjab with R1a, spread to other parts of India, perhaps associated with the spread of the Neolithic and the development of the Indus Valley civilization. This spread could then have also taken with it mtDNA lineages of haplogroup U, which are more abundant in the northwest of India, and the western Eurasian lineages of haplogroups H, J, and T.
    • …the occurrence of Y-chromosome haplogroups L, H, R2, and R1a in both caste and isolated tribal populations suggests much of the existing Indian population structure is very old. Additionally, the high diversity of Y haplogroups R1a1 and R2 in both South Indian and Indus valley populations has led to the suggestion that there is little, if any, genetic influence from other Eurasians on the castes of South India.
    • The antiquity and complex geographic distribution of the R1a1 and R2 haplogroups led these authors to conclude that the majority of the subcontinent Y-chromosomes arrived in or before the early Holocene (10,000 years ago) rather than in a later Indo-European expansion. Likewise, and concordant with other studies of tribal Indian populations, we observe Y-chromosome R1a1 lineages in South Indian tribal Irula (unpublished data), a population substantially differentiated from South Indian castes.
    • On the basis of the combined phylogeographic distributions of haplotypes observed among populations defined by social and linguistic criteria, candidate HGs that most plausibly arose in situ within the boundaries of present-day India include C5-M356, F*-M89, H-M69* (and its sub-clades H1-M52 and H2-APT), R2-M124, and L1-M76. The congruent geographic distribution of H-M69* and potentially paraphyletic F*-M89 Y chromosomes in India suggests that they might share a common demographic history.
    • The decreasing frequency of R2—from 7.4% in Pakistan to 3.8% in Central Asia (Wells et al. 2001) to 1% in Turkey (Cinnioglu et al. 2004)—is consistent with the pattern observed for the autochthonous Indian H1-M52 HG.
    • On the basis of a broad distribution (involving all social and linguistic categories in India) and relatively high diversification patterns, it can be concluded that representatives of HGs C5-M356, H-M69*, F*, L1, and R2 have ancestry indigenous to the Asian subcontinent.
    • In HGs R1a1 and R2, the associated mean microsatellite variance is highest in tribes, not castes. This is a clear contradiction of what would be expected from an explanation involving a model of recent occasional admixture. Beyond taking advantage of highly resolved phylogenetic hierarchy as just an efficient genotyping convenience, a comprehensive approach that leverages the phylogeography of Y-chromosome diversification by using a combination of HG diversification with geography and expansion-time estimates provides a more insightful and accurate perspective to the complex human history of South Asia.
    • When considered at the general HG level, L, R1a, and R2 all display approximate similarity with respect to population-category apportionment and frequency (Cordaux et al. 2004).
    • Although it would be convenient to assume that R1a1 and R2 representatives reflect a recent common demography (Cordaux et al. 2004), it is entirely plausible that they harbor as-yet-undiscovered subsequent haplogroup diversification that approximates the phylogeographic patterns revealed for HG L.
    • The phylogeography and the similarity of microsatellite variation of HGs R1a1 and R2 to L1-M76 in South Asian tribes argues that they likely share a common demographic history.
    • The distribution of HG R2-M124 is more circumscribed relative to R1a1, but it has been observed at informative levels in Central Asia, Turkey, Pakistan, and India. The distribution of R1a1 and R2 within India is similar, as are the levels of associated microsatellite variance. The ages of the Y-microsatellite variation for R1a1 and R2 in India suggest that the prehistoric context of these HGs will likely be complex.
    • .... there is no evidence whatsoever to conclude that Central Asia has been necessarily the recent donor and not the receptor of the R1a lineages. The current absence of additional informative binary subdivision within this HG obfuscates potential different histories hidden within this HG, making such interpretations as the sole and recent source area overly simplistic. The same can be said in respect to HG R2-M124.
    • The most frequent haplogroup among the Indian upper castes belongs to R lineages (R*, R1 and R2); together, these account for 44% of the upper caste Y-chromosomes. Haplogroup H was the most frequent Y lineage in both the lower castes and tribal populations, with frequencies of 0.25 and 0.30, respectively. The Indian Y-SNP tree (Figure 3) shows that the distribution pattern of the major Y lineages is similar in tribal and lower caste populations, and is distinct from the upper castes.
    • The sister clades R1a1 (M17) and R2 (M124) of the M207 lineage together form the largest Y haplogroup lineage in India, with a frequency of 0.32. They are present in substantial frequencies throughout the subcontinent, irrespective of regional and linguistic barriers. The haplogroup R-M17 also has a wide geographic distribution in Europe, West Asia and the Middle East, with highest frequencies in Eastern European populations [23]. It is proposed to have originated in the Eurasian Steppes, north of the Black and Caspian seas, in a population of the Kurgan culture known for the domestication of the horse, ~3500 ybp [23], and has widely been regarded as a marker for the male-mediated Indo-Aryan invasion of the Indian subcontinent. However, these observations were contradicted by the higher STR variations observed in the Indian M17 and M124 samples, compared with the European and Central Asian populations, suggesting a much deeper time depth for the origin of the Indian M17 lineages. In the present study, it was observed that the R lineages had successfully penetrated to high frequencies (0.26) in the South Indian tribal populations, a testimony to their arrival in the peninsula much before the recent migrations of Indo-European pastoralists from Central Asia. In a recent study, Sengupta et al [24] observed higher microsatellite variance, and clustering together of Indian M17 lineages compared with the Middle East and Europe. They proposed that it was an early invasion of M17 during the Holocene expansion that contributed to the tribal gene pool in India, rather than a recent gene flow from Indo-European nomads. However, we found that its frequency is much higher in upper castes (0.44) compared to that of the lower caste (0.22) and tribal groups (0.26). This uneven distribution pattern shows that the recent immigrations from Central Asia also contributed undoubtedly to a pre-existing gene pool.
    • The presence of the so-called west/central Asian lineages like J2, R1 and R2 in most of the endogamous tribal populations, and their higher STR diversity, indicate their presence in the sub-continent much before the arrival of the Indo-European pastoralists. In short, the impact of their arrival in the Indian sub-continent is social and political rather than genetic.
    • ... it was suggested that a package of Y-HGs (J2, R1a, R2 and L) was associated with the migration of Indo-European people from Central Asia [7]. Although our study observed a high frequency of Y-HGs R1a1, J*/J2, R2 and L, it was not exclusively restricted to any region or population (Table 1). Moreover, most of the population groups from the studied regions showed a lower frequency of the highly frequent haplogroups of Central Asia: C3, DE, I, G, J*, N and O, except for some population-specific distributions.
    • The percentage distribution of haplogroups in Brahmins (n=256) showed a total of six most frequent (percentage >5%) haplogroups: R1a1* (40.63%), J2 (12.5%), R2 (8.59%), L (7.81%), H1 (6.25%) and R1* (5.47%), contributing to 81.25% of the total distribution in Brahmins. Tribals and scheduled castes (n=254) also showed six haplogroups: H1 (31.10%), R1a1* (20.47%), J2 (10.24%), L (7.87%), H* (7.87%) and O (6.69%), contributing in total to 84.25%.
    • All together (Brahmins, scheduled castes and tribals), 22 Y-haplogroups were observed. The percentages of seven of these haplogroups (with percentage >5%) accounted for 85.5% of the total number of Y-chromosomes (n=2809). The haplogroups with their percentages in descending order were: R1a1* (21.1%), H1 (19.1%), R2 (10.5%), O (10.1%), L (9.5%), J*/J2 (8.3%) and F* (6.9%).
    • Five haplogroups out of 18 were found to be most frequent (>5%) in Brahmins (R1a1* (35.7%), J*/J2 (12.4%), L (11.3%), R2 (10.8%) and H1 (8.0%)) and represented 78.2% of the total number of samples (n=767), whereas haplogroup O was found to be very infrequent (0.7%) in Brahmin Y-chromosomes. Seven out of 14 haplogroups (with percentage >5%) (H1 (24.2%), R1a1* (17.2%), R2 (14.2%), L (12.2%), F* (9.8%), J*/J2 (6.4%) and K* (5.3%)) represented 89.3% of the total number of Dalit Y-chromosomes (n=674). Tribal Y-chromosomes represented by seven out of 20 haplogroups displayed percentages >5%: O (25.5%), H1 (25.3%), R1a1* (10.2%), F* (7.5%), R2 (6.4%), J*/J2 (6.1%) and L (5%) (86% of the total number of samples (n=1368)).
    Swarkar Sharma et al., “The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system,” J Hum Genet 54, no. 1 (January 9, 2009): 47-55.
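
The percentage breakdowns quoted from Sharma et al. can be sanity-checked by adding up the listed shares and comparing them with the stated totals. The following is only a minimal sketch: the group labels, the `check_totals` helper and the 0.1-point tolerance are my own assumptions and not anything from the paper. The shares for the n=254 group add up to about 0.01 less than the stated 84.25%, which is consistent with rounding of the individual percentages.

```python
# Haplogroup shares (%) and stated totals as quoted above from Sharma et al. (2009).
# The dictionary keys are labels chosen here for illustration only.
quoted = {
    "Brahmins (n=256)": ([40.63, 12.5, 8.59, 7.81, 6.25, 5.47], 81.25),
    "Tribals and scheduled castes (n=254)": ([31.10, 20.47, 10.24, 7.87, 7.87, 6.69], 84.25),
    "All groups (n=2809)": ([21.1, 19.1, 10.5, 10.1, 9.5, 8.3, 6.9], 85.5),
    "Brahmins (n=767)": ([35.7, 12.4, 11.3, 10.8, 8.0], 78.2),
    "Dalits (n=674)": ([24.2, 17.2, 14.2, 12.2, 9.8, 6.4, 5.3], 89.3),
    "Tribals (n=1368)": ([25.5, 25.3, 10.2, 7.5, 6.4, 6.1, 5.0], 86.0),
}


def check_totals(groups, tolerance=0.1):
    """Return {label: (summed share, True if within tolerance of the stated total)}.

    The tolerance (in percentage points) is an assumption that allows for
    rounding of the individual shares.
    """
    results = {}
    for label, (shares, stated) in groups.items():
        total = round(sum(shares), 2)
        results[label] = (total, abs(total - stated) <= tolerance)
    return results


if __name__ == "__main__":
    for label, (total, ok) in check_totals(quoted).items():
        print(f"{label}: sum = {total}, matches stated total: {ok}")
```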

    (Y Haplogroups and Aggressive Behavior in a Pakistani Ethnic Group)
    • Five Y haplogroups that are commonly found in Eurasia and Pakistan comprised 87% (n=136) of the population sample, with one haplogroup, R1a1, constituting 55% of the sampled population. A comparison of the total and four sub-scale mean scores across the five common Y haplogroups that were present at a frequency >=3% in this ethnic group revealed no overall significant differences. However, effect-size comparisons allowed us to detect an association of the haplogroups R2 (Cohen’s d statistic = .448–.732) and R1a1 (d = .107–.448) with lower self-reported aggression mean scores in this population.
    • Mean scores were lowest for haplogroup R2 (58.83) and highest for J2a2 (74.60).
    • Using measures that are independent of sample size, we were able to detect an effect-size association of haplogroups R2 and R1a1, with lower mean scores indicating that the male-specific regions of the Y chromosome may contribute to self-reported human aggressive behavior. Both R1a1 and R2 share a common ancestor on the Y phylogenetic tree [Karafet et al., 2008] and membership in haplogroup R defined by the M207 mutation accounts for a small effect size (Cohen’s d statistic = .120) and 6% of the variance in the mean scores (r = .241). This association of haplogroup R, which is a frequent haplogroup found in extant Indo-European populations, raises an intriguing possibility that behavior could have played a role in its evolutionary selection.
    S. Shoaib Shah et al., “Y haplogroups and aggressive behavior in a Pakistani ethnic group,” Aggressive Behavior 35, no. 1 (2009): 68-74.
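
For readers unfamiliar with the effect-size measure used in the excerpts above, Cohen's d is the difference between two group means divided by a standard deviation. The sketch below uses the common pooled-standard-deviation form. This is an assumption on my part, since the excerpts do not say which variant the authors used, and the scores in the example are made-up placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using a pooled standard deviation (sample variances).

    Assumes each group has at least two observations. Whether the paper
    used this exact variant is not stated in the excerpts.
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical self-reported aggression scores for two haplogroups;
    # these numbers are invented purely to show the call.
    group_other = [70, 75, 72, 78, 74]
    group_r2 = [55, 60, 58, 62, 59]
    print(f"d = {cohens_d(group_other, group_r2):.3f}")
```

By the usual rule of thumb, d around .2 is called small, .5 medium and .8 large, which is how a range like .448–.732 can count as a detectable association even when a significance test across all five haplogroups finds no overall difference.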

    Saturday, February 13, 2010

    Thursday, February 11, 2010

    FTDNA's public yDNA haplogroup projects

    Projects  
    A ..................... http://www.familytreedna.com/public/Haplogroup_A/
    C ..................... http://www.familytreedna.com/public/Chaplogroup/
    E-M35 ................. http://www.familytreedna.com/public/E3b/
    F ..................... http://www.familytreedna.com/public/F-YDNA/
    G ..................... http://www.familytreedna.com/public/G-YDNA/
    G2c (formerly G5) ..... http://www.familytreedna.com/public/G2c/
    E1a  .................. http://www.familytreedna.com/public/HaplogroupE1andE/
    H ..................... http://www.familytreedna.com/public/YHaploGroupH/
    I1 .................... http://www.familytreedna.com/public/yDNA_I1/
    I2a ................... http://www.familytreedna.com/public/I2aHapGroup/
    I2b2-L38 .............. http://www.familytreedna.com/public/I2b2/
    I-M223 ................ http://www.familytreedna.com/public/M223-Y-Clan/
    I-P109 "I1c" .......... http://www.familytreedna.com/public/yDNA_I-P109/
    I ..................... http://www.familytreedna.com/public/Haplogroup%20I%20Y-DNA%20Project%20Web%20Site/
    J ..................... http://www.familytreedna.com/public/Y-DNA_J/
    J2 .................... http://www.familytreedna.com/public/J2%20Y%20DNA%20group/index.aspx/
    J2b_455-8 ............. http://tracingroots.nova.org/J2b_455-8.htm
    J2b-M102+ ............. http://www.familytreedna.com/public/m102/
    J2Plus ................ http://www.familytreedna.com/public/J2Plus/
    Jewish E3b ............ http://www.familytreedna.com/public/JewishE3bProject/
    Jewish Q .............. http://www.familytreedna.com/public/Jewish_Q/
    L ..................... http://www.familytreedna.com/public/Y-Haplogroup-L/
    N ..................... http://www.familytreedna.com/public/N%20Y-DNA%20Project/
    O...O3 ................ http://www.familytreedna.com/public/o3/
    Q3 American Indian .... http://www.familytreedna.com/public/Amerind%20Y/
    Q ..................... http://www.familytreedna.com/public/yDNA_Q/
    R* .................... http://www.familytreedna.com/public/Rasterisk/
    R1* ................... http://www.familytreedna.com/public/R1Asterisk/
    R1a Y-DNA Haplogroup .. http://www.familytreedna.com/public/R1aY-Haplogroup/
    R1b and Subclades ..... http://www.familytreedna.com/public/r1b/
    R1b (U152+) ........... http://www.familytreedna.com/public/R1b1c10/
    R1b Jewish ............ http://www.familytreedna.com/public/JewishR1b/
    R1b1* ................. http://www.familytreedna.com/public/R1b1Asterisk/
    R1b1b1 (aka R-M73) .... http://www.familytreedna.com/public/R1b1b1/
    R1b1b2a1b4 SRY2627+ ... http://www.familytreedna.com/public/R1b1c6/
    r1b1b2Asterisk ........ http://www.familytreedna.com/public/r1b1b2/
    R1b-U106 .............. http://www.familytreedna.com/public/U106/
    R1b-U198/S29+ ......... http://www.familytreedna.com/public/U198/
    R2 .................... http://www.familytreedna.com/public/R2/
    R-ht35 (P312- U106-) .. http://www.familytreedna.com/public/ht35new/
    R-L21Plus ............. http://www.familytreedna.com/public/R-L21/
    R-M153 ................ http://www.familytreedna.com/public/R-M153_The_Basque_Marker/
    R-M222 ................ http://www.familytreedna.com/public/R1b1c7/
    R-P312 and Subclades .. http://www.familytreedna.com/public/atlantic-r1b1c/
    T (formerly K2) ....... http://www.familytreedna.com/public/Y-Haplogroup-K2/
     
    Links to Join Projects (if you've tested with FTDNA)
    A ...................... https://www.familytreedna.com/group-join-request.aspx?group=AYDNA
    C ...................... https://www.familytreedna.com/group-join-request.aspx?group=C
    E-M35 .................. https://www.familytreedna.com/group-join-request.aspx?group=E-M35_Project
    F ...................... https://www.familytreedna.com/group-join-request.aspx?group=F-YDNA
    G ...................... https://www.familytreedna.com/group-join-request.aspx?group=G_Haplogroup
    G2c (formerly G5) ...... https://www.familytreedna.com/group-join-request.aspx?group=G2c
    E1a .................... https://www.familytreedna.com/group-join-request.aspx?group=HaplogroupE1a
    H ...................... https://www.familytreedna.com/group-join-request.aspx?group=H-YDNA
    I1 ..................... https://www.familytreedna.com/group-join-request.aspx?group=I1
    I2a .................... https://www.familytreedna.com/group-join-request.aspx?group=I2a
    I2b2-L38 ............... https://www.familytreedna.com/group-join-request.aspx?group=I-L38
    I-M223 ................. https://www.familytreedna.com/group-join-request.aspx?group=I2b
    I-P109 "I1c" ........... https://www.familytreedna.com/group-join-request.aspx?group=I1c
    I ...................... https://www.familytreedna.com/group-join-request.aspx?group=I-Y-DNA
    J ...................... https://www.familytreedna.com/group-join-request.aspx?group=J-Y-DNA
    J2 ..................... https://www.familytreedna.com/group-join-request.aspx?group=J2
    J2b_455-8 .............. https://www.familytreedna.com/group-join-request.aspx?group=J2b_455-8
    J2b-M102+ .............. https://www.familytreedna.com/group-join-request.aspx?group=M102plus
    J2Plus ................. https://www.familytreedna.com/group-join-request.aspx?group=J2Plus
    Jewish E3b ............. https://www.familytreedna.com/group-join-request.aspx?group=Jewish_E3b
    Jewish Q ............... https://www.familytreedna.com/group-join-request.aspx?group=Jewish_Q
    L ...................... https://www.familytreedna.com/group-join-request.aspx?group=L
    N ...................... https://www.familytreedna.com/group-join-request.aspx?group=N-YDNA
    O...O3 ................. https://www.familytreedna.com/group-join-request.aspx?group=O3
    Q3 American Indian ..... https://www.familytreedna.com/group-join-request.aspx?group=Q3_AmericanIndian
    Q ...................... https://www.familytreedna.com/group-join-request.aspx?group=Q-YDNA
    R* ..................... https://www.familytreedna.com/group-join-request.aspx?group=R*
    R1* .................... https://www.familytreedna.com/group-join-request.aspx?group=R1*
    R1a .................... https://www.familytreedna.com/group-join-request.aspx?group=R1a
    R1b and Subclades ...... https://www.familytreedna.com/group-join-request.aspx?group=R1b
    R1b (U152+) ............ https://www.familytreedna.com/group-join-request.aspx?group=R1b1c10
    R1b Jewish ............. https://www.familytreedna.com/group-join-request.aspx?group=R1b_Jewish
    R1b1* .................. https://www.familytreedna.com/group-join-request.aspx?group=R1b1Asterisk
    R1b1b1 (aka R-M73) ..... https://www.familytreedna.com/group-join-request.aspx?group=R1b1b
    R1b1b2a1b4 SRY2627+ .... https://www.familytreedna.com/group-join-request.aspx?group=R1b1c6
    r1b1b2Asterisk ......... https://www.familytreedna.com/group-join-request.aspx?group=r1b1b2Asterisk
    R1b-U106 ............... https://www.familytreedna.com/group-join-request.aspx?group=R1b-U106
    R1b-U198/S29+ .......... https://www.familytreedna.com/group-join-request.aspx?group=R1b1b2g1
    R2 ..................... https://www.familytreedna.com/group-join-request.aspx?group=R2
    R-ht35 (P312- U106-) ... https://www.familytreedna.com/group-join-request.aspx?group=ht35
    R-L21Plus .............. https://www.familytreedna.com/group-join-request.aspx?group=R-L21Plus
    R-M153 ................. https://www.familytreedna.com/group-join-request.aspx?group=R-M153
    R-M222 ................. https://www.familytreedna.com/group-join-request.aspx?group=R-M222
    R-P312 and Subclades ... https://www.familytreedna.com/group-join-request.aspx?group=R-P312_and_Subclades
    T (formerly K2) ........ https://www.familytreedna.com/group-join-request.aspx?group=K2-Male