Single haplotype admixture models using large scale HLA genotype frequencies to reproduce human admixture

  • Original Article
  • Published in: Immunogenetics

Abstract

The human leukocyte antigen (HLA) is the most polymorphic region in humans. Anthropologists use HLA to trace the migration and evolution of populations. However, recent admixture between populations can mask the ancestral haplotype frequency distribution. We present a statistical method that resolves population admixture from high-resolution HLA haplotype frequencies using a non-negative matrix factorization formalism, and we validate it using haplotype frequencies from 56 world populations. The result is a minimal set of source components (SCs) that explains roughly 90% of the total variance in the studied admixtures. These SCs agree with the geographical distribution, phylogenies, and recent admixture events of the studied groups. With the growing number of multi-ethnic individuals, and of individuals who do not report race or ethnicity, HLA matching for stem-cell and solid organ transplants is becoming more challenging. The presented algorithm provides a framework for breaking down highly admixed populations into SCs, which can be used to better match the rapidly growing population of multi-ethnic individuals worldwide.
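To make the factorization idea concrete, the following is a minimal sketch of non-negative matrix factorization applied to a haplotype-by-population frequency matrix. It is not the authors' implementation: the matrix orientation (rows as haplotypes, columns as populations), the multiplicative update rule in the style of Lee and Seung, the function name `nmf_multiplicative`, and the synthetic Dirichlet data are all assumptions made for illustration.

```python
import numpy as np

def nmf_multiplicative(V, k, n_iter=2000, seed=0, eps=1e-12):
    """Approximate V (haplotypes x populations) as W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm objective.
    This is an illustrative choice, not necessarily the paper's solver.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps   # columns: candidate source components
    H = rng.random((k, m)) + eps   # columns: mixing weights per population
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic (hypothetical) data: 3 source components mixed into 12 populations.
rng = np.random.default_rng(1)
n_haplotypes, n_sources, n_populations = 40, 3, 12
true_sources = rng.dirichlet(np.ones(n_haplotypes), size=n_sources).T  # (40, 3)
true_mix = rng.dirichlet(np.ones(n_sources), size=n_populations).T     # (3, 12)
V = true_sources @ true_mix   # each column is a frequency vector summing to 1

W, H = nmf_multiplicative(V, k=n_sources)

# Rescale so every source component is itself a frequency distribution;
# the product W_norm @ H_norm is unchanged by this rescaling.
scale = W.sum(axis=0)
W_norm = W / scale
H_norm = H * scale[:, None]

rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

After the rescaling, column `j` of `H_norm` can be read as the approximate contribution of each source component to population `j`. NMF solutions are generally not unique and depend on initialization, so in practice several random starts would be compared.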

Data availability

The authors affirm that all data necessary for confirming the conclusions of the article are present within the article, figures, and tables or in published datasets.

Acknowledgment

We would like to thank the following registries for allowing their data to be used for this study: National Marrow Donor Program/Be The Match, USA; Ezer Mizion Bone Marrow Donor Registry, Israel; OneMatch Stem Cell and Marrow Network, Canada; Australian Bone Marrow Donor Registry, Australia; Matchis: the Dutch Centre for Stem Cell Donors, The Netherlands; Norwegian Bone Marrow Donor Registry, Norway; New Zealand Bone Marrow Donor Registry, New Zealand; Tobias Registry of Swedish Bone Marrow Donors, Sweden; Thai National Stem Cell Donor Registry, Thailand; Welsh Bone Marrow Donor Registry, Wales, UK.

Author information

Corresponding author

Correspondence to Yoram Louzoun.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author Summary

Anthropologists frequently use HLA to trace the migration and evolution of populations. This is possible because strong linkage among HLA genes leads to the transmission of intact haplotypes from parents to offspring, preserving key ancestral features of a population. This linkage has been proposed to result from positive selection. Over the last century, however, humankind has undergone rapid population mixing, producing admixtures of the original populations. Algorithms are therefore required to disentangle these admixtures, both for population genetics and for improving matching in stem-cell and solid organ transplantation.

To address these challenges, we developed a new HLA-based method that identifies population-level admixture from high-resolution HLA haplotype frequencies. This contrasts with existing methods, which estimate individual-level admixture from the genomic distribution of SNPs.

We show that 90% of the genetic variance of the HLA haplotypes can be explained using a much-reduced set of 8 source components (SCs). The estimated SCs and admixture models agree with the geographical distribution, population phylogenies, and recent historic admixture events of the studied populations.
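One way to picture how a component count could be chosen against a variance threshold is sketched below. This is an assumption-laden illustration rather than the paper's procedure: the explained-variance definition (one minus the residual sum of squares over the total sum of squares about the grand mean), the helper names `nmf`, `explained_fraction`, and `smallest_rank`, and the synthetic data are all hypothetical.

```python
import numpy as np

def nmf(V, k, n_iter=1000, seed=0, eps=1e-12):
    """Rank-k non-negative factorization V ~ W @ H via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k)) + eps
    H = rng.random((k, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def explained_fraction(V, W, H):
    """Assumed definition: 1 - SS_residual / SS_total (about the grand mean)."""
    ss_res = np.sum((V - W @ H) ** 2)
    ss_tot = np.sum((V - V.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def smallest_rank(V, threshold=0.9, k_max=10):
    """Return the smallest k whose fit reaches the threshold, plus all fractions."""
    fractions = []
    chosen = None
    for k in range(1, k_max + 1):
        W, H = nmf(V, k)
        fractions.append(explained_fraction(V, W, H))
        if chosen is None and fractions[-1] >= threshold:
            chosen = k
    return chosen, fractions

# Synthetic (hypothetical) frequencies: 4 sources mixed into 20 populations.
rng = np.random.default_rng(2)
sources = rng.dirichlet(np.full(50, 0.2), size=4).T   # (50, 4), sparse-ish
mixing = rng.dirichlet(np.ones(4), size=20).T         # (4, 20)
V = sources @ mixing

k_max = 6
chosen, fractions = smallest_rank(V, threshold=0.9, k_max=k_max)
for k, f in enumerate(fractions, start=1):
    print(f"k={k}: explained fraction = {f:.3f}")
print("smallest k reaching the threshold:", chosen)
```

Because each fit starts from a random initialization, the explained fraction is not guaranteed to increase monotonically with `k`; repeating each fit with several seeds and keeping the best one would make the selection more stable.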

Electronic supplementary material

ESM 1

(DOCX 5238 kb)

Cite this article

Simanovsky, A.L., Madbouly, A., Halagan, M. et al. Single haplotype admixture models using large scale HLA genotype frequencies to reproduce human admixture. Immunogenetics 71, 589–604 (2019). https://doi.org/10.1007/s00251-019-01144-7

  • DOI: https://doi.org/10.1007/s00251-019-01144-7
