Abstract
In this study, we consider admixed populations through their expected heterozygosity, a measure of genetic diversity. A population is termed admixed if its members possess recent ancestry from two or more separate sources. As a result of the fusion of source populations with different genetic variants, admixed populations can exhibit high levels of genetic diversity, reflecting contributions of their multiple ancestral groups. For a model of an admixed population derived from K source populations, we obtain a relationship between its heterozygosity and its proportions of admixture from the various source populations. We show that the heterozygosity of the admixed population is at least as great as that of the least heterozygous source population, and that it potentially exceeds the heterozygosities of all of the source populations. The admixture proportions that maximize the heterozygosity possible for an admixed population formed from a specified set of source populations are also obtained under specific conditions. We examine the special case of \(K=2\) source populations in detail, characterizing the maximal admixture in terms of the heterozygosities of the two source populations and the value of \(F_{ST}\) between them. In this case, the heterozygosity of the admixed population exceeds the maximal heterozygosity of the source groups if the divergence between them, measured by \(F_{ST}\), is large enough, namely above a certain bound that is a function of the heterozygosities of the source groups. We present applications to simulated data as well as to data from human admixture scenarios, providing results useful for interpreting the properties of genetic variability in admixed populations.
References
Alcala N, Rosenberg NA (2017) Mathematical constraints on \(F_{ST}\): biallelic markers in arbitrarily many populations. Genetics 206:1581–1600
Alcala N, Rosenberg NA (2019) \(G_{ST}^\prime \), Jost’s \(D\), and \(F_{ST}\) are similarly constrained by allele frequencies: a mathematical, simulation, and empirical study. Mol Ecol 28:1624–1636
Balding DJ, Nichols RA (1995) A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica 96:3–12
Boca SM, Rosenberg NA (2011) Mathematical properties of \(F_{ST}\) between admixed populations and their parental source populations. Theor Popul Biol 80:208–216
Buerkle CA, Lexer C (2008) Admixture as the basis for genetic mapping. Trends Ecol Evol 23:686–694
Chakraborty R (1986) Gene admixture in human populations: models and predictions. Yrbk Phys Anthropol 29:1–43
Edge MD, Rosenberg NA (2014) Upper bounds on \(F_{ST}\) in terms of the frequency of the most frequent allele and total homozygosity: the case of a specified number of alleles. Theor Popul Biol 97:20–34
Gravel S (2012) Population genetics models of local ancestry. Genetics 191:607–619
Graybill FA (1976) Theory and application of the linear model. Duxbury, Pacific Grove, CA
Hedrick PW (1999) Perspective: highly variable loci and their interpretation in evolution and conservation. Evolution 53:313–318
Hedrick PW (2005) A standardized genetic differentiation measure. Evolution 59:1633–1638
Horn RA, Johnson CR (2012) Matrix analysis. Cambridge University Press, New York, NY
Huelsenbeck JP, Andolfatto P (2007) Inference of population structure under a Dirichlet process model. Genetics 175:1787–1802
Jakobsson M, Edge MD, Rosenberg NA (2013) The relationship between \(F_{ST}\) and the frequency of the most frequent allele. Genetics 193:515–528
Kotz S, Balakrishnan N, Johnson NL (2000) Continuous multivariate distributions. Volume 1: models and applications. Wiley, New York
Lange K (1997) Mathematical and statistical methods for genetic analysis. Springer, New York
Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, Cavalli-Sforza LL, Myers RM (2008) Worldwide human relationships inferred from genome-wide patterns of variation. Science 319:1100–1104
Long JC (1991) The genetic structure of admixed populations. Genetics 127:417–428
Long JC, Kittles RA (2003) Human genetic diversity and the nonexistence of biological races. Hum Biol 75:449–471
Magnus JR, Neudecker H (2007) Matrix differential calculus with applications in statistics and econometrics, 3rd edn. Wiley, Chichester
Maruki T, Kumar S, Kim Y (2012) Purifying selection modulates the estimates of population differentiation and confounds genome-wide comparisons across single-nucleotide polymorphisms. Mol Biol Evol 29:3617–3623
Mehta RS, Feder AF, Boca SM, Rosenberg NA (2019) The relationship between haplotype-based \(F_{ST}\) and haplotype length. Genetics 213:281–295
Millar RB (1987) Maximum likelihood estimation of mixed stock fishery composition. Can J Fish Aquat Sci 44:583–590
Mooney JA, Huber CD, Service S, Sul JH, Marsden CD, Zhang Z, Sabatti C, Ruiz-Linares A, Bedoya G, Costa Rica/Colombia Consortium for Genetic Investigation of Bipolar Endophenotypes, Freimer N, Lohmueller KE (2018) Understanding the hidden complexity of Latin American population isolates. Am J Hum Genet 103:707–726
Nagylaki T (1998) Fixation indices in subdivided populations. Genetics 148:1325–1332
Pemberton TJ, Absher D, Feldman MW, Myers RM, Rosenberg NA, Li JZ (2012) Genomic patterns of homozygosity in worldwide human populations. Am J Hum Genet 91:275–292
Pemberton TJ, DeGiorgio M, Rosenberg NA (2013) Population structure in a comprehensive genomic data set on human microsatellite variation. G3: Genes, Genomes, Genetics 3:891–907
Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959
Reddy SB, Rosenberg NA (2012) Refining the relationship between homozygosity and the frequency of the most frequent allele. J Math Biol 64:87–108
Risch N, Choudhry S, Via M, Basu A, Sebro R, Eng C, Beckman K, Thyne S, Chapela R, Rodriguez-Santana JR, Rodriguez-Cintron W, Avila PC, Ziv E, Burchard EG (2009) Ancestry-related assortative mating in Latino populations. Genome Biol 10:R132
Rosenberg NA, Calabrese PP (2004) Polyploid and multilocus extensions of the Wahlund inequality. Theor Popul Biol 66:381–391
Rosenberg NA, Li LM, Ward R, Pritchard JK (2003) Informativeness of genetic markers for inference of ancestry. Am J Hum Genet 73:1402–1422
San Lucas FA, Rosenberg NA, Scheet P (2012) Haploscope: a tool for the graphical display of haplotype structure in populations. Genet Epidemiol 35:17–21
Schroeder KB, Jakobsson M, Crawford MH, Schurr TG, Boca SM, Conrad DF, Tito RY, Osipova LP, Tarskaia LA, Zhadanov SI, Wall JD, Pritchard JK, Malhi RS, Smith DG, Rosenberg NA (2009) Haplotypic background of a private allele at high frequency in the Americas. Mol Biol Evol 26:995–1016
Verdu P, Rosenberg NA (2011) A general mechanistic model for admixture histories of hybrid populations. Genetics 189:1413–1426
Wang S, Ray N, Rojas W, Parra MV, Bedoya G, Gallo C, Poletti G, Mazzotti G, Hill K, Hurtado AM, Camrena B, Nicolini H, Klitz W, Barrantes R, Molina JA, Freimer NB, Bortolini MC, Salzano FM, Petzl-Erler ML, Tsuneto LT, Dipierri JE, Alfaro EL, Bailliet G, Bianchi NO, Llop E, Rothhammer F, Excoffier L, Ruiz-Linares A (2008) Geographic patterns of genome admixture in Latin American Mestizos. PLoS Genet 4:e1000037
Zhu X, Tang H, Risch N (2008) Admixture mapping and the role of population structure for localizing disease genes. Adv Genet 60:547–569
Zou JY, Park DS, Burchard EG, Torgerson DG, Pino-Yanes M, Song YS, Sankararaman S, Halperin E, Zaitlen N (2015) Genetic and socioeconomic study of mate choice in Latinos reveals novel assortment patterns. Proc Natl Acad Sci USA 112:13621–13626
Acknowledgements
Rohan Mehta provided assistance with the SNP data. We thank two reviewers for comments on the manuscript. Support was provided by NIH grant HG005855 and NSF grant BCS-1515127.
Appendices
Appendix 1: Proofs for arbitrary K: Theorem 5 and Corollary 6
For the proof of Theorem 5, we first show (i) that \(P'P\) and A are both invertible under the conditions stated in the theorem, deriving in the process an explicit expression (Eq. 20) for \(A^{-1}\) in terms of \((P'P)^{-1}\).
We then (ii) use constrained optimization via Lagrange multipliers to obtain the maximum of \({\underline{\gamma }}'A{\underline{\gamma }}\) subject to \(\underline{1}'{\underline{\gamma }}=1\). This step consists of the first-derivative test to find a stationary point, coupled with the second-derivative test, in Lemma 12, to show that the stationary point defines a local maximum. Finally, we (iii) show that this means that the overall maximum is either at the local maximum \({\underline{\gamma }}^*\) as described in the statement of the theorem or on the boundary of the set \(\{{\underline{\gamma }}: \underline{1}'{\underline{\gamma }} = 1 \text{ and } {\underline{\gamma }} \in \Delta ^{K-1} \}\).
Proof of Theorem 5 (i) Because P is a \(J \times K\) matrix with column rank K, \(K \times K\) matrix \(P'P\) is positive definite. As a positive definite matrix, \(P'P\) is invertible and \((P'P)^{-1}\) is also positive definite (Graybill 1976, pp. 21-22).
To show that \(A=\underline{1} \underline{1}' - P'P\) is invertible, we use the Sherman–Morrison formula for the inverse of a rank-one update of an invertible matrix (Horn and Johnson 2012, pp. 18–19). This formula states that for an invertible square \(n \times n\) matrix X and \(n \times 1\) column vectors \(\underline{y}\) and \(\underline{z}\), \(X+\underline{y}\,\underline{z}'\) is invertible if and only if \(1+\underline{z}' X^{-1}\underline{y} \ne 0\), with:
$$\begin{aligned} (X+\underline{y}\,\underline{z}')^{-1} = X^{-1} - \frac{X^{-1}\underline{y}\,\underline{z}' X^{-1}}{1+\underline{z}' X^{-1}\underline{y}}. \end{aligned}$$
Because we assumed \(\underline{1}'(P'P)^{-1}\underline{1} \ne 1\), the Sherman–Morrison formula applies with \(-(P'P)\) in the role of X, and \(K \times 1\) column vectors \(\underline{1}\) in the role of \(\underline{y}\) and \(\underline{z}\). A has inverse:
$$\begin{aligned} A^{-1} = -(P'P)^{-1} - \frac{(P'P)^{-1}\underline{1}\,\underline{1}'(P'P)^{-1}}{1-\underline{1}'(P'P)^{-1}\underline{1}}. \end{aligned}$$
Left-multiplying by \(\underline{1}'\) and right-multiplying by \(\underline{1}\), we obtain
$$\begin{aligned} \underline{1}'A^{-1}\underline{1} = \frac{\underline{1}'(P'P)^{-1}\underline{1}}{\underline{1}'(P'P)^{-1}\underline{1} - 1}. \end{aligned}$$
Because \((P'P)^{-1}\) is positive definite, \(\underline{1}'(P'P)^{-1}\underline{1} > 0\) by definition, and because \(\underline{1}'(P'P)^{-1}\underline{1} \ne 1\) by assumption, we conclude that \(\frac{1}{\underline{1}'A^{-1}\underline{1}}\) is always defined.
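As a numerical aside (not part of the proof), the Sherman–Morrison inversion of \(A = \underline{1}\,\underline{1}' - P'P\) can be checked on a small example using NumPy; the matrix P below is an arbitrary illustrative choice satisfying the conditions of the theorem.

```python
import numpy as np

# Illustrative J=3, K=2 allele-frequency matrix (columns are the
# frequency vectors of the two source populations and sum to 1)
P = np.array([[0.6, 0.2],
              [0.3, 0.5],
              [0.1, 0.3]])
K = P.shape[1]
one = np.ones((K, 1))

PtP = P.T @ P               # positive definite because P has full column rank
A = one @ one.T - PtP       # A = 11' - P'P

# Sherman-Morrison with X = -(P'P) and y = z = 1, valid here because
# 1'(P'P)^{-1} 1 != 1 for this P
X_inv = -np.linalg.inv(PtP)
denom = 1.0 + (one.T @ X_inv @ one).item()
A_inv = X_inv - (X_inv @ one @ one.T @ X_inv) / denom

# Agrees with direct inversion of A
assert np.allclose(A_inv, np.linalg.inv(A))
```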
(ii) To maximize \({\underline{\gamma }}' A {\underline{\gamma }}\) subject to \(\underline{1}' {\underline{\gamma }} = 1\), we use Lagrange multipliers. Let \(f({\underline{\gamma }})={\underline{\gamma }}' A {\underline{\gamma }}\), and let \(g({\underline{\gamma }}) = \underline{1}' {\underline{\gamma }}\). The Lagrange function is defined as:
$$\begin{aligned} \Lambda ({\underline{\gamma }}, \lambda ) = f({\underline{\gamma }}) - \lambda \, [g({\underline{\gamma }}) - 1] = {\underline{\gamma }}'A{\underline{\gamma }} - \lambda \, (\underline{1}'{\underline{\gamma }} - 1). \end{aligned}$$
Denoting by \(\underline{0}\) a column vector of zeroes of length K, we solve a system of equations for \({\underline{\gamma }}\) and \(\lambda \),
$$\begin{aligned} \frac{\partial \Lambda ({\underline{\gamma }}, \lambda )}{\partial {\underline{\gamma }}} = \underline{0}, \qquad \frac{\partial \Lambda ({\underline{\gamma }}, \lambda )}{\partial \lambda } = 0. \end{aligned}$$
Equation 21 includes K equations \({\partial \Lambda ({\underline{\gamma }}, \lambda )}/{\partial \gamma _k} = 0\) for \(1 \leqslant k \leqslant K\).
A is symmetric, so we have
$$\begin{aligned} \frac{\partial ({\underline{\gamma }}'A{\underline{\gamma }})}{\partial {\underline{\gamma }}} = (A+A'){\underline{\gamma }} = 2A{\underline{\gamma }}. \end{aligned}$$
For the derivatives of the Lagrange function, we have:
$$\begin{aligned} \frac{\partial \Lambda ({\underline{\gamma }}, \lambda )}{\partial {\underline{\gamma }}} = 2A{\underline{\gamma }} - \lambda \underline{1}, \qquad \frac{\partial \Lambda ({\underline{\gamma }}, \lambda )}{\partial \lambda } = -(\underline{1}'{\underline{\gamma }} - 1). \end{aligned}$$
Setting the derivatives with respect to \({\underline{\gamma }}\) to \(\underline{0}\) leads to:
$$\begin{aligned} {\underline{\gamma }} = \frac{\lambda }{2} A^{-1} \underline{1}. \end{aligned}$$
Hence, applying the constraint \(\underline{1}'{\underline{\gamma }} = 1\) to solve for \(\lambda \), the solution for \({\underline{\gamma }}\) is:
$$\begin{aligned} {\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}. \end{aligned}$$
Because \({\underline{\gamma }}' A {\underline{\gamma }}\) is a differentiable function of \({\underline{\gamma }}\), its maximum on \(\Delta ^{K-1}\) can occur either on the boundary or at a critical point. The following lemma shows that the critical point \({\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}\) is a local maximum.
Lemma 12
The critical point \({\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}\) is a local maximum of \(H_{\mathrm {adm}}\) seen as a function of \({\underline{\gamma }}\) on \(\Delta ^{K-1}\), under the conditions stated in Theorem 5.
Proof
To show that \({\underline{\gamma }}^*\) is a local maximum, we use the second-derivative test for constrained optimization (e.g. Magnus and Neudecker 2007, p. 155). This test considers the bordered Hessian matrix, representing the matrix of second derivatives of the Lagrange function \(\Lambda \) with respect to \(\lambda \) and the components of \({\underline{\gamma }}\):
$$\begin{aligned} F = \begin{pmatrix} 0 &{} \underline{1}' \\ \underline{1} &{} 2A \end{pmatrix}. \end{aligned}$$
We must consider the leading principal minors of F, that is, the determinants of its upper-left square submatrices. For \(r=2,3,\ldots ,K\), we denote by \(F_r\) the upper-left \((r+1) \times (r+1)\) submatrix of F, consisting of the border row and column together with the upper-left \(r \times r\) block of 2A; the relevant minors are the \(\det (F_r)\). Using the definition of A from Eq. 11, we obtain
$$\begin{aligned} F_r = \begin{pmatrix} 0 &{} \underline{1_r}' \\ \underline{1_r} &{} 2\,(\underline{1_r}\,\underline{1_r}' - M_r) \end{pmatrix}, \end{aligned}$$
with \(M_r\) the upper-left \(r \times r\) submatrix of \(P'P\) and \(\underline{1_r}\) the column vector of length r consisting of 1s.
A sufficient condition for the critical point to be a local maximum is for \((-1)^r \det (F_r) > 0\) for each r (Magnus and Neudecker 2007, p. 155). We now show that this condition is satisfied.
Using the fact that multiplying a row or column of a matrix by a scalar multiplies the determinant by that scalar, we multiply rows 2 through \(r+1\) by \(-1\) and get
$$\begin{aligned} \det (F_r) = (-1)^r \det \begin{pmatrix} 0 &{} \underline{1_r}' \\ -\underline{1_r} &{} 2M_r - 2\,\underline{1_r}\,\underline{1_r}' \end{pmatrix}. \end{aligned}$$
Using the fact that adding a multiple of one row or column to another does not change the determinant, we add \(-2\) times the first column to each of the remaining columns. We also multiply the first column by \(-1\). We then have
$$\begin{aligned} \det (F_r) = (-1)^{r+1} \det \begin{pmatrix} 0 &{} \underline{1_r}' \\ \underline{1_r} &{} 2M_r \end{pmatrix}, \end{aligned}$$
where \(M_r\) is the \(r \times r\) matrix consisting of the upper-left corner of matrix \(P'P\), and \(\underline{1_r}\) is the column vector of length r consisting of 1s.
We now apply a result for the determinant of partitioned matrices (Graybill 1976, pp. 19–20). If W is invertible, then
$$\begin{aligned} \det \begin{pmatrix} a &{} \underline{z}' \\ \underline{y} &{} W \end{pmatrix} = \det (W)\, (a - \underline{z}' W^{-1} \underline{y}). \end{aligned}$$
Applying this result to Eq. 22, we obtain
$$\begin{aligned} \det (F_r) = (-1)^{r+1} \det (2M_r)\, \big (0 - \underline{1_r}'(2M_r)^{-1}\underline{1_r}\big ) = (-1)^r\, 2^{r-1} \det (M_r)\, \underline{1_r}'M_r^{-1}\underline{1_r}. \end{aligned}$$
Because \(P'P\) is positive definite, \(M_r\) is also positive definite. To demonstrate this result, note that because \(\underline{x}'P'P\underline{x} > 0\) for each nonzero column vector \(\underline{x}\), \(\underline{x}'P'P\underline{x} > 0\) for each nonzero \(\underline{x}\) with \(x_k = 0\) for \(k > r\). Because \(M_r\) is positive definite, \(\det (M_r) > 0\) and \(M_r^{-1}\) is also positive definite, leading to \(\underline{1_r}'M_r^{-1}\underline{1_r} > 0\). We conclude
$$\begin{aligned} (-1)^r \det (F_r) = 2^{r-1} \det (M_r)\, \underline{1_r}'M_r^{-1}\underline{1_r} > 0, \end{aligned}$$
so that the critical point is the location of a local maximum.\(\square \)
Proof of Theorem 5 (continued)
Returning to part (iii) of the proof, following Lemma 12, if \({\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}\) is interior to the simplex \(\Delta ^{K-1}\), then \(H_{\mathrm {adm}}\) is maximal at \({\underline{\gamma }} = {\underline{\gamma }}^*\), with maximum \(H_{\mathrm {adm}}({\underline{\gamma }}^*) = \frac{1}{\underline{1}'A^{-1}\underline{1}}\), the reciprocal of the sum of the elements of \(A^{-1}\). If \({\underline{\gamma }}^*\) is not interior to \(\Delta ^{K-1}\), then the maximum lies on the boundary of \(\Delta ^{K-1}\).
Finally, we note that \({\underline{\gamma }}^* = \frac{(P'P)^{-1}\underline{1}}{\underline{1}'(P'P)^{-1}\underline{1}}\) by using Eq. 20.\(\square \)
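The closed form for \({\underline{\gamma }}^*\) can be checked against a brute-force search over the simplex. The following sketch (our illustrative \(J=3\), \(K=2\) frequency matrix, not from the text) confirms both the location of the maximum and the maximal value \(1/(\underline{1}'A^{-1}\underline{1})\).

```python
import numpy as np

# Illustrative J=3, K=2 frequency matrix (columns sum to 1)
P = np.array([[0.6, 0.2],
              [0.3, 0.5],
              [0.1, 0.3]])
PtP = P.T @ P
one = np.ones(2)

# Stationary point from the theorem: gamma* = (P'P)^{-1} 1 / (1'(P'P)^{-1} 1)
g = np.linalg.solve(PtP, one)
gamma_star = g / g.sum()

def H_adm(gamma):
    # heterozygosity of the admixed population: 1 - sum_j (P gamma)_j^2
    q = P @ gamma
    return 1.0 - np.dot(q, q)

# Brute-force search over the 1-simplex confirms the maximizer
grid = np.linspace(0.0, 1.0, 10001)
vals = [H_adm(np.array([g1, 1.0 - g1])) for g1 in grid]
assert abs(gamma_star[0] - grid[int(np.argmax(vals))]) < 1e-3

# The maximal value equals 1 / (1' A^{-1} 1), the reciprocal of the
# sum of the elements of A^{-1}
A = np.ones((2, 2)) - PtP
assert abs(H_adm(gamma_star) - 1.0 / np.linalg.inv(A).sum()) < 1e-9
```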
Proof of Corollary 6
In Theorem 5, the maximum of \(H_{\mathrm {adm}}\) occurs either in the interior of the simplex \(\Delta ^{K-1}\) or on the boundary of the set \(\{{\underline{\gamma }}: \underline{1}'{\underline{\gamma }} = 1 \text{ and } {\underline{\gamma }} \in \Delta ^{K-1} \}\).
The boundary of the simplex is the union of K faces, which are themselves \((K-2)\)-simplices. If the maximum lies on the boundary of \(\Delta ^{K-1}\), then without loss of generality, we can permute the labels of the source populations so that \(\gamma _K=0\).
We drop column K from matrix P and apply Theorem 5 with this new \(J \times (K-1)\) matrix, \(P_{\{1,\ldots ,K-1\}}\), which has rank \(K-1\). By assumption, \(\underline{1}' (P_{\{1,\ldots ,K-1\}}' P_{\{1,\ldots ,K-1\}})^{-1} \underline{1} \ne 1\).
We then apply Theorem 5 to \(P_{\{1,\ldots ,K-1\}}\). The maximum of \(H_{\mathrm {adm}}\) occurs either at the point \(\gamma _{{\mathcal {S}}}\), where \({\mathcal {S}}=\{1,2,\ldots , K-1\}\), or on the boundary of the set \(\{{\underline{\gamma }}: \underline{1}'{\underline{\gamma }} = 1 \text{ and } {\underline{\gamma }} \in \Delta ^{K-2} \}\).
We repeat this method of descent, decrementing the dimension (and permuting population labels without loss of generality) until we reach the case of only two source populations. A final application of Theorem 5 then finds that \(H_{\mathrm {adm}}\) is maximized either interior to the 1-simplex—the line connecting vertices (1, 0) and (0, 1)—or at one of these vertices. \(\square \)
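The descent just described can be implemented directly: for every nonempty subset of source populations, compute the stationary point of Theorem 5 restricted to the corresponding face, keep it if it is feasible, and take the best candidate. A hedged numerical sketch (the matrix P is an arbitrary illustrative choice with linearly independent columns):

```python
import itertools
import numpy as np

# Illustrative J=3, K=3 frequency matrix (columns sum to 1)
P = np.array([[0.70, 0.10, 0.25],
              [0.20, 0.60, 0.25],
              [0.10, 0.30, 0.50]])
J, K = P.shape

def H_adm(gamma):
    q = P @ gamma
    return 1.0 - np.dot(q, q)

# For every nonempty subset S of sources, compute the stationary point of
# Theorem 5 restricted to the face gamma_k = 0 for k not in S; keep it if it
# lies on the closed face, and track the best candidate.
best = -1.0
for r in range(1, K + 1):
    for S in itertools.combinations(range(K), r):
        Ps = P[:, list(S)]
        g = np.linalg.solve(Ps.T @ Ps, np.ones(r))
        g = g / g.sum()
        if np.all(g >= 0.0):
            gamma = np.zeros(K)
            gamma[list(S)] = g
            best = max(best, H_adm(gamma))

# Dense brute-force search over the 2-simplex for comparison
grid = np.linspace(0.0, 1.0, 201)
brute = max(H_adm(np.array([a, b, 1.0 - a - b]))
            for a in grid for b in grid if a + b <= 1.0)
assert best >= brute - 1e-4
```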
Appendix 2: Proofs for \(K=2\): Propositions 7 and 11, Corollaries 8–10
Proof of Proposition 7
We maximize the quadratic polynomial in Eqs. 12–14 over \(\gamma _1 \in [0,1]\). The maximum occurs at the unique critical point or on the boundary of the interval.
Setting the derivative of Eq. 14 with respect to \(\gamma _1\) to 0, we find that the critical point is
$$\begin{aligned} \gamma _1^* = \frac{C_{12}-H_2}{2\,(C_{12}-H_S)}. \end{aligned}$$
Because the leading coefficient of Eq. 14 is negative for \(\underline{p_1} \ne \underline{p_2}\), the critical point is a maximum. Hence, if \((C_{12}-H_2 ) / [ 2(C_{12}-H_S) ] \in (0,1)\), then the maximum of \(H_{\mathrm {adm}}\) on the interval [0, 1] lies at \(\gamma _1 = (C_{12}-H_2 ) / [ 2(C_{12}-H_S) ]\). Otherwise, the maximum lies either at \(\gamma _1=0\), in which case it equals \(H_2\), or at \(\gamma _1=1\), in which case it equals \(H_1\).
The conditions describing the location of the maximum can be written in terms of \(H_1\), \(H_2\), and \(C_{12}\). Because the denominator of \(\gamma ^*_1\) in Eq. 23 is always positive for \(\underline{p_1} \ne \underline{p_2}\) (Sect. 4), \(\gamma _1^* \in (0,1)\) becomes equivalent to \(C_{12} > H_1\) and \(C_{12} > H_2\), the former inequality arising from the condition \(\gamma _1^* < 1 \) and the latter from the condition \(\gamma _1^* > 0\).
If the requirement \(C_{12} > H_1\) and \(C_{12} > H_2\) for \(\gamma _1^* \in (0,1)\) fails, then the maximum occurs on the boundary of the unit interval. We have \(H_{\mathrm {adm}}(0)=H_2\) and \(H_{\mathrm {adm}}(1)=H_1\). Thus, the maximum lies at \(\gamma _1=0\) if \(H_2>H_1\) and at \(\gamma _1=1\) if \(H_1>H_2\).
If \(C_{12} > H_1\) and \(C_{12} > H_2\) do not both hold, then one of them must still hold, because we showed in Sect. 4 that \(2C_{12} > H_1+H_2\). Combining this observation with the boundary analysis, in which \(H_2>H_1\) leads to a maximum at \(\gamma _1=0\) and \(H_1>H_2\) leads to a maximum at \(\gamma _1=1\), we complete the characterization of the three cases.
Note that the three cases in the statement of the proposition capture all possible values of \((H_1,H_2,C_{12})\). By the Cauchy–Schwarz inequality, \((1-C_{12})^2 \leqslant (1-H_1)(1-H_2)\), with equality requiring \(\underline{p_1} = \underline{p_2}\). Hence, with \(\underline{p_1} \ne \underline{p_2}\) assumed, either \(1-C_{12} < 1-H_1\) and \(1-C_{12} \geqslant 1-H_2\) (case (ii)), \(1-C_{12} < 1-H_2\) and \(1-C_{12} \geqslant 1-H_1\) (case (iii)), or both \(1-C_{12} < 1-H_1\) and \(1-C_{12} < 1-H_2\) (case (i)).
Alternative expressions in terms of \(H_1\), \(H_2\), and \(F_{12}\) can be derived by noting that \(H_S=\frac{1}{2}(H_1+H_2)\), \(H_1 H_2 = H_S^2 - [(H_1-H_2)/2]^2\) and \(C_{12} = H_S (1+F_{12})/(1-F_{12})\), the latter simply restating Eq. 4 (recalling \(C_{12}=1\) for \(F_{12}=1\)). Thus, we have
$$\begin{aligned} \gamma _1^*&= \frac{1}{2} + \frac{(H_1-H_2)(1-F_{12})}{4\,(H_1+H_2)\,F_{12}}, \\ H_{\mathrm {adm}}(\gamma _1^*)&= \frac{H_1+H_2}{2\,(1-F_{12})} + \frac{(H_1-H_2)^2\,(1-F_{12})}{8\,(H_1+H_2)\,F_{12}}. \end{aligned}$$
Another formulation uses the heterozygosity of a population formed by equal admixture of populations 1 and 2, or \(H_T\). Because \(F_{12}=1-H_S/H_T\) by Eq. 1, \(F_{12}/(1-F_{12}) = (H_T-H_S)/H_S\). Using this relationship in Eqs. 24 and 25,
$$\begin{aligned} \gamma _1^*&= \frac{1}{2} + \frac{H_1-H_2}{8\,(H_T-H_S)}, \\ H_{\mathrm {adm}}(\gamma _1^*)&= H_T + \frac{(H_1-H_2)^2}{16\,(H_T-H_S)}. \end{aligned}$$
\(\square \)
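As a numerical illustration of Proposition 7 in case (i), the sketch below checks the critical point of Eq. 23 and the identity \(C_{12} = H_S(1+F_{12})/(1-F_{12})\). The allele-frequency vectors are our arbitrary choices, and \(C_{12}\) is computed as \(1-\sum _j p_{1j}p_{2j}\), consistent with the identities used in the proof of Proposition 11.

```python
def het(p):
    # heterozygosity: 1 - sum_j p_j^2
    return 1.0 - sum(x * x for x in p)

# Illustrative J=3 allele-frequency vectors for the two sources
p1 = [0.6, 0.3, 0.1]
p2 = [0.2, 0.5, 0.3]

H1, H2 = het(p1), het(p2)
C12 = 1.0 - sum(a * b for a, b in zip(p1, p2))
HS = (H1 + H2) / 2.0
HT = (H1 + H2 + 2.0 * C12) / 4.0     # heterozygosity of the 50/50 admixture
F12 = 1.0 - HS / HT                  # Eq. 1

# Identity C12 = HS (1 + F12) / (1 - F12)
assert abs(C12 - HS * (1.0 + F12) / (1.0 - F12)) < 1e-12

# Case (i) applies here (C12 exceeds both H1 and H2), so the critical
# point of Eq. 23 is the interior maximizer
assert C12 > max(H1, H2)
gamma1_star = (C12 - H2) / (2.0 * (C12 - HS))

def H_adm(g1):
    return het([g1 * a + (1.0 - g1) * b for a, b in zip(p1, p2)])

best = max((i / 10000.0 for i in range(10001)), key=H_adm)
assert abs(best - gamma1_star) < 1e-3
assert H_adm(gamma1_star) > max(H1, H2)
```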
Proof of Corollary 8
Suppose \(H_1 \geqslant H_2\). If case (i) from Proposition 7 applies, then because \(H_T > H_S\), \(\gamma ^*_1 \geqslant \frac{1}{2}\). Case (ii) cannot apply because \(H_1 < C_{12}\), \(H_2 \geqslant C_{12}\), and \(H_1 \geqslant H_2\) cannot hold simultaneously. In case (iii), \(\gamma ^*_1 = 1 \geqslant \frac{1}{2}\). For the reverse direction, if \(H_1 < H_2\) and case (i) or case (ii) applies, then \(\gamma ^*_1 < \frac{1}{2}\). Case (iii) cannot apply because \(H_1 \geqslant C_{12}\), \(H_2 < C_{12}\), and \(H_1 < H_2\) cannot hold simultaneously.\(\square \)
Proof of Corollary 9
First, we see that \(H_{\mathrm {adm}}(\gamma _1^*) \geqslant H_T\) in case (i) of Proposition 7. In case (ii), \(H_2 > H_T = (H_1+H_2+2C_{12})/4\) because \(H_2 > H_1\) and \(H_2 \geqslant C_{12}\). In case (iii), \(H_1 > H_T\) because \(H_1 > H_2\) and \(H_1 \geqslant C_{12}\). Note that if \(H_1=H_2\), then case (i) applies, producing \(H_{\mathrm {adm}}(\gamma _1^*)=H_T\).\(\square \)
Proof of Corollary 10
We restate the condition \(0< (C_{12}-H_2)/[2(C_{12}-H_S)] < 1\) as
$$\begin{aligned} 0< \frac{1}{2} + \frac{(H_1-H_2)(1-F_{12})}{4\,(H_1+H_2)\,F_{12}} < 1. \end{aligned}$$
Subtracting \(\frac{1}{2}\) from both sides and multiplying by 2, an equivalent condition is
$$\begin{aligned} -1< \frac{(H_1-H_2)(1-F_{12})}{2\,(H_1+H_2)\,F_{12}} < 1, \end{aligned}$$
or, equivalently, \(| H_1 - H_2 |/{[2\frac{F_{12}}{1-F_{12}}(H_1+H_2)]} < 1\). We rearrange this last expression to obtain the desired result.\(\square \)
Proof of Proposition 11
We apply Proposition 7 with \(J=2\). Substituting \(p_{12}=1-p_{11}\) and \(p_{22}=1-p_{21}\) in Eqs. 15 and 16, we obtain \(C_{12}-H_2 = (p_{11}-p_{21})(1-2p_{21})\), \(C_{12}-H_1 = (p_{21}-p_{11})(1-2p_{11})\), \(C_{12}-H_S = (p_{11}-p_{21})^2\), and \(C_{12}^2-H_1H_2 = (p_{11}-p_{21})^2\). Thus, because \(p_{11}=p_{21}\) is not permitted, the quantities in Eqs. 15 and 16 reduce to those of Eqs. 18 and 19, respectively.
To complete the application of Proposition 7 to \(K=2\), note that case (i) of Proposition 7 occurs when \((p_{11}-p_{21})(1-2p_{21}) > 0\) and \((p_{21}-p_{11})(1-2p_{11}) > 0\). The first of this pair of inequalities requires both \(p_{11} - p_{21} > 0\) and \(1-2p_{21} > 0\), so that \(p_{11} > p_{21}\) and \(\frac{1}{2} > p_{21}\), or both \(p_{11} - p_{21} < 0\) and \(1-2p_{21} < 0\), so that \(p_{11} < p_{21}\) and \(\frac{1}{2} < p_{21}\). The second inequality requires both \(p_{21} - p_{11} > 0\) and \(1-2p_{11} > 0\), so that \(p_{21} > p_{11}\) and \(\frac{1}{2} > p_{11}\), or both \(p_{21} - p_{11} < 0\) and \(1-2p_{11} < 0\), so that \(p_{21} < p_{11}\) and \(\frac{1}{2} < p_{11}\). Thus, the conditions of case (i) of Proposition 7 obtain if and only if \(p_{11}> \frac{1}{2} > p_{21}\) or \(p_{21}> \frac{1}{2} > p_{11}\).
Similarly, using the expressions for \(H_1\), \(H_2\), and \(C_{12}\) when \(K=2\), the conditions of case (ii) of Proposition 7 are equivalent to \(\frac{1}{2} \geqslant p_{21} > p_{11}\) or \(p_{11} > p_{21} \geqslant \frac{1}{2}\). The conditions of case (iii) are equivalent to \(\frac{1}{2} \geqslant p_{11} > p_{21}\) or \(p_{21} > p_{11} \geqslant \frac{1}{2}\).\(\square \)
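Proposition 11's characterization can be spot-checked by brute force in the biallelic case: some interior \(\gamma _1\) beats both source heterozygosities exactly when \(p_{11}\) and \(p_{21}\) straddle \(\frac{1}{2}\). A sketch (the function name and grid resolution are our choices):

```python
def het2(p):
    # biallelic heterozygosity, 2 p (1 - p)
    return 2.0 * p * (1.0 - p)

def interior_max_beats_both(p11, p21):
    # brute force: does some gamma1 in (0,1) give H_adm above max(H1, H2)?
    H1, H2 = het2(p11), het2(p21)
    best = max(het2((g / 1000.0) * p11 + (1.0 - g / 1000.0) * p21)
               for g in range(1, 1000))
    return best > max(H1, H2) + 1e-9

# Case (i): the two frequencies straddle 1/2
assert interior_max_beats_both(0.8, 0.3)
assert interior_max_beats_both(0.1, 0.7)
# Cases (ii) and (iii): both frequencies on the same side of 1/2
assert not interior_max_beats_both(0.3, 0.45)
assert not interior_max_beats_both(0.9, 0.6)
```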
Appendix 3: Dirichlet model for allele frequencies
We first provide results concerning \(H_{\mathrm {adm}}\) in the case that the K source populations have independently and identically distributed (IID) allele frequency vectors. Next, we specify that these IID vectors follow Dirichlet distributions.
1.1 IID allele frequency vectors
We begin by examining the expected values of \(H_k\) and \(H_{\mathrm {adm}}\).
Proposition 13
Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\). Then \({\mathbb {E}}[H_{\mathrm {adm}}] = {\mathbb {E}}[H_1] + ( 1 - \sum _{k=1}^K \gamma _k^2 ) ( \sum _{j=1}^J \mathrm {Var}[p_{1j}] )\).
Proof
We use Eq. 8:
$$\begin{aligned} {\mathbb {E}}[H_{\mathrm {adm}}] = 1 - \sum _{j=1}^J {\mathbb {E}}\Bigg [\Bigg (\sum _{k=1}^K \gamma _k p_{kj}\Bigg )^2\Bigg ] = 1 - \sum _{j=1}^J \Bigg (\sum _{k=1}^K \gamma _k^2\, {\mathbb {E}}[p_{kj}^2] + 2\sum _{k=1}^{K-1} \sum _{\ell =k+1}^K \gamma _k \gamma _\ell \, {\mathbb {E}}[p_{kj}]\, {\mathbb {E}}[p_{\ell j}]\Bigg ), \end{aligned}$$
where the cross terms factor by the independence of the allele frequency vectors.
Using the IID assumption and simplifying by noting that \(1 = (\sum _{k=1}^K \gamma _k )^2 = ( \sum _{k=1}^K \gamma _k^2 ) + (2 \sum _{k=1}^{K-1} \sum _{\ell =k+1}^K \gamma _k \gamma _\ell )\), we have
$$\begin{aligned} {\mathbb {E}}[H_{\mathrm {adm}}] = 1 - \Bigg (\sum _{k=1}^K \gamma _k^2\Bigg ) \sum _{j=1}^J {\mathbb {E}}[p_{1j}^2] - \Bigg (1 - \sum _{k=1}^K \gamma _k^2\Bigg ) \sum _{j=1}^J {\mathbb {E}}[p_{1j}]^2, \end{aligned}$$
from which the result follows.\(\square \)
An immediate corollary of Proposition 13 is that \(H_{\mathrm {adm}}\) has expectation greater than or equal to the expectation of the heterozygosity of each of the source populations.
Corollary 14
Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\). Then \({\mathbb {E}}[H_{\mathrm {adm}}] \geqslant {\mathbb {E}}[H_k]\).
A second corollary results from the Cauchy–Schwarz inequality, by which \(\sum _{k=1}^K \gamma _k^2 \geqslant \frac{1}{K}\), with equality if and only if \((\gamma _1, \gamma _2, \ldots , \gamma _K) = (\frac{1}{K}, \frac{1}{K}, \ldots , \frac{1}{K})\).
Corollary 15
Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\). Considering all admixture vectors \({\underline{\gamma }} \in \Delta ^{K-1}\), \({\mathbb {E}}[H_{\mathrm {adm}}]\) is maximized at \({\underline{\gamma }} = (\frac{1}{K}, \frac{1}{K}, \ldots , \frac{1}{K})\), and has maximal value \({\mathbb {E}}[H_1] + ( 1 - \frac{1}{K} ) \sum _{j=1}^J \mathrm {Var}[p_{1j}]\).
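Proposition 13 and Corollary 15 lend themselves to a quick Monte Carlo check. The sketch below uses K=3 biallelic sources with Uniform(0,1) allele frequencies, an illustrative special case of the IID assumption for which \(\mathrm {Var}[p_{1j}] = 1/12\) for each of the J=2 alleles.

```python
import random

random.seed(2020)

# K=3 biallelic (J=2) sources with p_{k1} ~ Uniform(0,1), IID across k
gamma = [0.5, 0.3, 0.2]
n = 200_000
sum_Hadm = 0.0
sum_H1 = 0.0
for _ in range(n):
    p = [random.random() for _ in range(3)]
    q = sum(g * pk for g, pk in zip(gamma, p))   # admixed frequency of allele 1
    sum_Hadm += 2.0 * q * (1.0 - q)              # heterozygosity of admixed pop.
    sum_H1 += 2.0 * p[0] * (1.0 - p[0])          # heterozygosity of source 1

# E[H_adm] = E[H_1] + (1 - sum_k gamma_k^2) * sum_j Var[p_{1j}]
var_sum = 2.0 / 12.0     # the two allele frequencies each have variance 1/12
predicted = sum_H1 / n + (1.0 - sum(g * g for g in gamma)) * var_sum
assert abs(sum_Hadm / n - predicted) < 5e-3
```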
1.2 IID allele frequency vectors from a symmetric Dirichlet distribution
We now further assume that the independently and identically distributed allele frequency vectors follow a symmetric multivariate Dirichlet distribution. This distribution is frequently used for allele frequency distributions (Balding and Nichols 1995; Pritchard et al. 2000; Huelsenbeck and Andolfatto 2007), and it is a natural probability distribution to assume for allelic types with the same marginal distributions.
The J-dimensional Dirichlet-\((\alpha _1, \alpha _2, \ldots , \alpha _J)\) distribution is defined over the open unit \((J-1)\)-simplex \(\Delta ^{J-1}\) and has concentration parameters \(\alpha _j >0\). The means and variances for the individual allele frequencies are (Lange 1997; Kotz et al. 2000, chapter 49):
$$\begin{aligned} {\mathbb {E}}[p_{kj}] = \frac{\alpha _j}{J{\overline{\alpha }}}, \qquad \mathrm {Var}[p_{kj}] = \frac{\alpha _j\, (J{\overline{\alpha }} - \alpha _j)}{(J{\overline{\alpha }})^2\, (J{\overline{\alpha }} + 1)}, \end{aligned}$$
where \({\overline{\alpha }} = \frac{1}{J}\sum _{j=1}^J \alpha _j\).
The symmetric Dirichlet distribution assumes \(\alpha _1 = \alpha _2 = \ldots = \alpha _J = {\overline{\alpha }}\), leading to:
$$\begin{aligned} {\mathbb {E}}[p_{kj}] = \frac{1}{J}, \qquad \mathrm {Var}[p_{kj}] = \frac{J-1}{J^2\, (J{\overline{\alpha }}+1)}. \end{aligned}$$
Making these substitutions in Proposition 13, we obtain the expectation of \(H_{\mathrm {adm}}\) under the assumption that the allele frequency vectors follow independent Dirichlet distributions.
Corollary 16
Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\), all with symmetric multivariate Dirichlet distributions with concentration parameter \({\overline{\alpha }}\). Then
$$\begin{aligned} {\mathbb {E}}[H_{\mathrm {adm}}] = \frac{(J-1)\,{\overline{\alpha }}}{J{\overline{\alpha }}+1} + \Bigg (1 - \sum _{k=1}^K \gamma _k^2\Bigg ) \frac{J-1}{J\, (J{\overline{\alpha }}+1)}, \end{aligned}$$
in which the first term equals \({\mathbb {E}}[H_k]\).
This corollary implies that both \({\mathbb {E}}[H_k]\) and \({\mathbb {E}}[H_{\mathrm {adm}}]\) are increasing functions of J and \({\overline{\alpha }}\).
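A small simulation is consistent with these moments. Note that the closed form \({\mathbb {E}}[H_k]=(J-1){\overline{\alpha }}/(J{\overline{\alpha }}+1)\) used below is our own computation from the stated symmetric-Dirichlet mean and variance, not a formula quoted from the text.

```python
import random

random.seed(7)

def dirichlet_sym(J, alpha):
    # symmetric Dirichlet draw via normalized Gamma variates
    g = [random.gammavariate(alpha, 1.0) for _ in range(J)]
    s = sum(g)
    return [x / s for x in g]

J, alpha, n = 4, 0.8, 100_000
acc = 0.0
for _ in range(n):
    p = dirichlet_sym(J, alpha)
    acc += 1.0 - sum(x * x for x in p)   # heterozygosity of one source draw

# E[H_k] = (J - 1) alpha / (J alpha + 1), from mean 1/J and variance
# (J - 1) / (J^2 (J alpha + 1)); increasing in both J and alpha
expected = (J - 1) * alpha / (J * alpha + 1)
assert abs(acc / n - expected) < 5e-3
```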
The next proposition considers the special case of \(K=2\) and \(J=2\), further specifying a uniform distribution for \(\gamma _1\).
Proposition 17
Consider \(K=2\) and \(J=2\). Suppose that the values of \(p_{11}\) and \(p_{21}\) are independently chosen from a uniform-[0,1] distribution, and that \(\gamma _1\) is also chosen from a uniform-[0,1] distribution. Then \({\mathbb {P}}[H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}] = 1 - \log 2 \approx 0.307\).
Proof
Using Proposition 11, we identify the regions of the unit square for \((p_{11},p_{21})\) in which \(\max _{\gamma _1 \in (0,1)} H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\). These regions are \(\{ (p_{11},p_{21}) \,|\, \frac{1}{2}< p_{11}< 1, 0< p_{21} < \frac{1}{2} \}\) and \(\{ (p_{11},p_{21}) \,|\, 0< p_{11}< \frac{1}{2}, \frac{1}{2}< p_{21} < 1 \}\).
Within those regions, we must determine the portion of the unit interval for \(\gamma _1\) in which \(H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\); note that \(H_{\mathrm {adm}}(\gamma _1)\) is a quadratic function of \(\gamma _1\). We ignore the set of zero volume with \(H_1=H_2\). In the regions for \((p_{11},p_{21})\) in which \(\max _{\gamma _1 \in (0,1)} H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\) and \(H_2 > H_1\), the interval for \(\gamma _1\) in which \(H_{\mathrm {adm}}(\gamma _1) > H_2\) is \((0,\frac{1-2p_{21}}{p_{11}-p_{21}})\). In the regions in which \(\max _{\gamma _1 \in (0,1)} H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\) and \(H_1 > H_2\), the interval in which \(H_{\mathrm {adm}}(\gamma _1) > H_1\) is \((\frac{p_{21}-1+p_{11}}{p_{21}-p_{11}}, 1)\).
The desired probability is the volume within the unit cube for \((p_{11}, p_{21}, \gamma _1)\) of the regions in which \(H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\). Using the symmetry between the two regions and the reflection \((p_{11},p_{21}) \mapsto (1-p_{21},1-p_{11})\), the volume is
$$\begin{aligned} 4\int _{1/2}^{1} \int _{1-p_{11}}^{1/2} \frac{1-2p_{21}}{p_{11}-p_{21}}\, dp_{21}\, dp_{11} = 4\int _{1/2}^{1} (2p_{11}-1)(1-\log 2)\, dp_{11} = 1 - \log 2. \end{aligned}$$
\(\square \)
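Proposition 17 can be verified by Monte Carlo: drawing \(p_{11}\), \(p_{21}\), and \(\gamma _1\) independently from Uniform(0,1) and counting how often the admixed population is more heterozygous than both sources should give roughly \(1-\log 2 \approx 0.307\). A sketch:

```python
import math
import random

random.seed(11)

def het2(p):
    # biallelic heterozygosity, 2 p (1 - p)
    return 2.0 * p * (1.0 - p)

n, hits = 400_000, 0
for _ in range(n):
    p11, p21, g1 = random.random(), random.random(), random.random()
    q = g1 * p11 + (1.0 - g1) * p21      # admixed frequency of allele 1
    if het2(q) > max(het2(p11), het2(p21)):
        hits += 1

# Agrees with the exact value 1 - log 2 to Monte Carlo accuracy
assert abs(hits / n - (1.0 - math.log(2.0))) < 5e-3
```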
Boca, S.M., Huang, L. & Rosenberg, N.A. On the heterozygosity of an admixed population. J. Math. Biol. 81, 1217–1250 (2020). https://doi.org/10.1007/s00285-020-01531-9