
On the heterozygosity of an admixed population

  • Published in: Journal of Mathematical Biology

Abstract

In this study, we consider admixed populations through their expected heterozygosity, a measure of genetic diversity. A population is termed admixed if its members possess recent ancestry from two or more separate sources. As a result of the fusion of source populations with different genetic variants, admixed populations can exhibit high levels of genetic diversity, reflecting contributions of their multiple ancestral groups. For a model of an admixed population derived from K source populations, we obtain a relationship between its heterozygosity and its proportions of admixture from the various source populations. We show that the heterozygosity of the admixed population is at least as great as that of the least heterozygous source population, and that it potentially exceeds the heterozygosities of all of the source populations. The admixture proportions that maximize the heterozygosity possible for an admixed population formed from a specified set of source populations are also obtained under specific conditions. We examine the special case of \(K=2\) source populations in detail, characterizing the maximal admixture in terms of the heterozygosities of the two source populations and the value of \(F_{ST}\) between them. In this case, the heterozygosity of the admixed population exceeds the maximal heterozygosity of the source groups if the divergence between them, measured by \(F_{ST}\), is large enough, namely above a certain bound that is a function of the heterozygosities of the source groups. We present applications to simulated data as well as to data from human admixture scenarios, providing results useful for interpreting the properties of genetic variability in admixed populations.


References

  • Alcala N, Rosenberg NA (2017) Mathematical constraints on \(F_{ST}\): biallelic markers in arbitrarily many populations. Genetics 206:1581–1600

  • Alcala N, Rosenberg NA (2019) \(G_{ST}^\prime \), Jost’s \(D\), and \(F_{ST}\) are similarly constrained by allele frequencies: a mathematical, simulation, and empirical study. Mol Ecol 28:1624–1636

  • Balding DJ, Nichols RA (1995) A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica 96:3–12

  • Boca SM, Rosenberg NA (2011) Mathematical properties of \(F_{ST}\) between admixed populations and their parental source populations. Theor Popul Biol 80:208–216

  • Buerkle CA, Lexer C (2008) Admixture as the basis for genetic mapping. Trends Ecol Evol 23:686–694

  • Chakraborty R (1986) Gene admixture in human populations: models and predictions. Yrbk Phys Anthropol 29:1–43

  • Edge MD, Rosenberg NA (2014) Upper bounds on \(F_{ST}\) in terms of the frequency of the most frequent allele and total homozygosity: the case of a specified number of alleles. Theor Popul Biol 97:20–34

  • Gravel S (2012) Population genetics models of local ancestry. Genetics 191:607–619

  • Graybill FA (1976) Theory and application of the linear model. Duxbury, Pacific Grove, CA

  • Hedrick PW (1999) Perspective: highly variable loci and their interpretation in evolution and conservation. Evolution 53:313–318

  • Hedrick PW (2005) A standardized genetic differentiation measure. Evolution 59:1633–1638

  • Horn RA, Johnson CR (2012) Matrix analysis. Cambridge University Press, New York, NY

  • Huelsenbeck JP, Andolfatto P (2007) Inference of population structure under a Dirichlet process model. Genetics 175:1787–1802

  • Jakobsson M, Edge MD, Rosenberg NA (2013) The relationship between \(F_{ST}\) and the frequency of the most frequent allele. Genetics 193:515–528

  • Kotz S, Balakrishnan N, Johnson NL (2000) Continuous multivariate distributions. Volume 1: models and applications. Wiley, New York

  • Lange K (1997) Mathematical and statistical methods for genetic analysis. Springer, New York

  • Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, Cavalli-Sforza LL, Myers RM (2008) Worldwide human relationships inferred from genome-wide patterns of variation. Science 319:1100–1104

  • Long JC (1991) The genetic structure of admixed populations. Genetics 127:417–428

  • Long JC, Kittles RA (2003) Human genetic diversity and the nonexistence of biological races. Hum Biol 75:449–471

  • Magnus JR, Neudecker H (2007) Matrix differential calculus with applications in statistics and econometrics, 3rd edn. Wiley, Chichester

  • Maruki T, Kumar S, Kim Y (2012) Purifying selection modulates the estimates of population differentiation and confounds genome-wide comparisons across single-nucleotide polymorphisms. Mol Biol Evol 29:3617–3623

  • Mehta RS, Feder AF, Boca SM, Rosenberg NA (2019) The relationship between haplotype-based \(F_{ST}\) and haplotype length. Genetics 213:281–295

  • Millar RB (1987) Maximum likelihood estimation of mixed stock fishery composition. Can J Fish Aquat Sci 44:583–590

  • Mooney JA, Huber CD, Service S, Sul JH, Marsden CD, Zhang Z, Sabatti C, Ruiz-Linares A, Bedoya G, Costa Rica/Colombia Consortium for Genetic Investigation of Bipolar Endophenotypes, Freimer N, Lohmueller KE (2018) Understanding the hidden complexity of Latin American population isolates. Am J Hum Genet 103:707–726

  • Nagylaki T (1998) Fixation indices in subdivided populations. Genetics 148:1325–1332

  • Pemberton TJ, Absher D, Feldman MW, Myers RM, Rosenberg NA, Li JZ (2012) Genomic patterns of homozygosity in worldwide human populations. Am J Hum Genet 91:275–292

  • Pemberton TJ, DeGiorgio M, Rosenberg NA (2013) Population structure in a comprehensive genomic data set on human microsatellite variation. G3: Genes Genomes Genet 3:891–907

  • Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959

  • Reddy SB, Rosenberg NA (2012) Refining the relationship between homozygosity and the frequency of the most frequent allele. J Math Biol 64:87–108

  • Risch N, Choudhry S, Via M, Basu A, Sebro R, Eng C, Beckman K, Thyne S, Chapela R, Rodriguez-Santana JR, Rodriguez-Cintron W, Avila PC, Ziv E, Burchard EG (2009) Ancestry-related assortative mating in Latino populations. Genome Biol 10:R132

  • Rosenberg NA, Calabrese PP (2004) Polyploid and multilocus extensions of the Wahlund inequality. Theor Popul Biol 66:381–391

  • Rosenberg NA, Li LM, Ward R, Pritchard JK (2003) Informativeness of genetic markers for inference of ancestry. Am J Hum Genet 73:1402–1422

  • San Lucas FA, Rosenberg NA, Scheet P (2012) Haploscope: a tool for the graphical display of haplotype structure in populations. Genet Epidemiol 35:17–21

  • Schroeder KB, Jakobsson M, Crawford MH, Schurr TG, Boca SM, Conrad DF, Tito RY, Osipova LP, Tarskaia LA, Zhadanov SI, Wall JD, Pritchard JK, Malhi RS, Smith DG, Rosenberg NA (2009) Haplotypic background of a private allele at high frequency in the Americas. Mol Biol Evol 26:995–1016

  • Verdu P, Rosenberg NA (2011) A general mechanistic model for admixture histories of hybrid populations. Genetics 189:1413–1426

  • Wang S, Ray N, Rojas W, Parra MV, Bedoya G, Gallo C, Poletti G, Mazzotti G, Hill K, Hurtado AM, Camrena B, Nicolini H, Klitz W, Barrantes R, Molina JA, Freimer NB, Bortolini MC, Salzano FM, Petzl-Erler ML, Tsuneto LT, Dipierri JE, Alfaro EL, Bailliet G, Bianchi NO, Llop E, Rothhammer F, Excoffier L, Ruiz-Linares A (2008) Geographic patterns of genome admixture in Latin American Mestizos. PLoS Genet 4:e1000037

  • Zhu X, Tang H, Risch N (2008) Admixture mapping and the role of population structure for localizing disease genes. Adv Genet 60:547–569

  • Zou JY, Park DS, Burchard EG, Torgerson DG, Pino-Yanes M, Song YS, Sankararaman S, Halperin E, Zaitlen N (2015) Genetic and socioeconomic study of mate choice in Latinos reveals novel assortment patterns. Proc Natl Acad Sci USA 112:13621–13626

Acknowledgements

Rohan Mehta provided assistance with the SNP data. We thank two reviewers for comments on the manuscript. Support was provided by NIH grant HG005855 and NSF grant BCS-1515127.

Author information

Corresponding author

Correspondence to Simina M. Boca.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Proofs for arbitrary K: Theorem 5 and Corollary 6

For the proof of Theorem 5, we first show (i) that \(P'P\) and A are both invertible under the conditions stated in the theorem, and that:

$$\begin{aligned} \frac{1}{\underline{1}'A^{-1}\underline{1}} = 1- \frac{1}{\underline{1}'(P'P)^{-1}\underline{1}}. \end{aligned}$$

We then (ii) use constrained optimization via Lagrange multipliers to obtain the maximum of \({\underline{\gamma }}'A{\underline{\gamma }}\) subject to \(\underline{1}'{\underline{\gamma }}=1\). This step consists of the first-derivative test to find a stationary point, coupled with the second-derivative test, in Lemma 12, to show that the stationary point defines a local maximum. Finally, we (iii) show that the overall maximum is attained either at the local maximum \({\underline{\gamma }}^*\) described in the statement of the theorem or on the boundary of the simplex \(\Delta ^{K-1}\).

Proof of Theorem 5 (i) Because P is a \(J \times K\) matrix with column rank K, \(K \times K\) matrix \(P'P\) is positive definite. As a positive definite matrix, \(P'P\) is invertible and \((P'P)^{-1}\) is also positive definite (Graybill 1976, pp. 21-22).

To show that \(A=\underline{1} \underline{1}' - P'P\) is invertible, we use the Sherman-Morrison formula for the inverse of a rank-one update of an invertible matrix (Horn and Johnson 2012, pp. 18-19). This formula states that for an invertible square \(n \times n\) matrix X and \(n \times 1\) column vectors \(\underline{y}\) and \(\underline{z}\), \(X+\underline{yz}'\) is invertible if and only if \(1+\underline{z}' X^{-1}\underline{y} \ne 0\), with:

$$\begin{aligned} (X + \underline{yz}')^{-1} = X^{-1} - \frac{ X^{-1}\underline{yz}' X^{-1}}{1+\underline{z}' X^{-1}\underline{y}}. \end{aligned}$$

Because we assumed \(\underline{1}'(P'P)^{-1}\underline{1} \ne 1\), the Sherman-Morrison formula applies with \(-(P'P)\) in the role of X, and \(K \times 1\) column vectors \(\underline{1}\) in the role of \(\underline{y}\) and \(\underline{z}\). A has inverse:

$$\begin{aligned} A^{-1} = \frac{(P'P)^{-1}\underline{11}'(P'P)^{-1}}{\underline{1}'(P'P)^{-1}\underline{1}-1}-(P'P)^{-1}. \end{aligned}$$
(20)

Left-multiplying by \(\underline{1}'\) and right-multiplying by \(\underline{1}\), we obtain

$$\begin{aligned} \frac{1}{\underline{1}'A^{-1}\underline{1}} = 1- \frac{1}{\underline{1}'(P'P)^{-1}\underline{1}}. \end{aligned}$$

Because \((P'P)^{-1}\) is positive definite, \(\underline{1}'(P'P)^{-1}\underline{1} > 0\) by definition, and because \(\underline{1}'(P'P)^{-1}\underline{1} \ne 1\) by assumption, we conclude that \(\frac{1}{\underline{1}'A^{-1}\underline{1}}\) is always defined.
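Part (i) can be sanity-checked numerically. The sketch below is a hypothetical example, not from the paper's data: P is a randomly generated \(J \times K\) frequency matrix with full column rank, and the code verifies the Sherman–Morrison expression for \(A^{-1}\) in Eq. 20 and the scalar identity above.

```python
import numpy as np

rng = np.random.default_rng(0)
J, K = 5, 3  # illustrative sizes; any J >= K with rank(P) = K works

# Random J x K allele-frequency matrix: each column sums to 1.
P = rng.random((J, K))
P /= P.sum(axis=0)

ones = np.ones(K)
PtP = P.T @ P                       # positive definite when P has column rank K
A = np.outer(ones, ones) - PtP      # A = 11' - P'P

# Sherman-Morrison inverse of A (Eq. 20), valid because 1'(P'P)^{-1}1 != 1.
PtP_inv = np.linalg.inv(PtP)
s = ones @ PtP_inv @ ones           # s = 1'(P'P)^{-1}1
A_inv = np.outer(PtP_inv @ ones, ones @ PtP_inv) / (s - 1) - PtP_inv

# Compare with a direct inverse, and check 1/(1'A^{-1}1) = 1 - 1/s.
err = np.max(np.abs(A_inv - np.linalg.inv(A)))
identity_gap = abs(1 / (ones @ A_inv @ ones) - (1 - 1 / s))
```

Both `err` and `identity_gap` should sit at roundoff level for any full-rank P with \(s \ne 1\).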

(ii) To maximize \({\underline{\gamma }}' A {\underline{\gamma }}\) subject to \(\underline{1}' {\underline{\gamma }} = 1\), we use Lagrange multipliers. Let \(f({\underline{\gamma }})={\underline{\gamma }}' A {\underline{\gamma }}\), and let \(g({\underline{\gamma }}) = \underline{1}' {\underline{\gamma }}\). The Lagrange function is defined as:

$$\begin{aligned} \Lambda ({\underline{\gamma }}, \lambda ) = f({\underline{\gamma }}) + \lambda [g({\underline{\gamma }})-1]. \end{aligned}$$

Denoting by \(\underline{0}\) a column vector of zeroes of length K, we solve a system of equations for \({\underline{\gamma }}\) and \(\lambda \),

$$\begin{aligned} \bigg ( \frac{\delta \Lambda ({\underline{\gamma }}, \lambda )}{\delta {\underline{\gamma }}}, \frac{\delta \Lambda ({\underline{\gamma }}, \lambda )}{\delta \lambda } \bigg ) = (\underline{0}, 0). \end{aligned}$$
(21)

Equation 21 comprises the K equations \({\delta \Lambda ({\underline{\gamma }}, \lambda )}/{\delta \gamma _k} = 0\) for \(1 \leqslant k \leqslant K\), together with the constraint equation \({\delta \Lambda ({\underline{\gamma }}, \lambda )}/{\delta \lambda } = 0\).

A is symmetric, so we have

$$\begin{aligned} \frac{\delta f({\underline{\gamma }})}{\delta {\underline{\gamma }}}= & {} \frac{\delta ({\underline{\gamma }}'A{\underline{\gamma }})}{\delta {\underline{\gamma }}} = (A+A'){\underline{\gamma }} = 2A {\underline{\gamma }} \\ \frac{\delta g({\underline{\gamma }})}{\delta {\underline{\gamma }}}= & {} \underline{1}. \end{aligned}$$

For the derivatives of the Lagrange function, we have:

$$\begin{aligned} \bigg ( \frac{\delta \Lambda ({\underline{\gamma }}, \lambda )}{\delta {\underline{\gamma }}}, \frac{\delta \Lambda ({\underline{\gamma }}, \lambda )}{\delta \lambda } \bigg ) = (2A{\underline{\gamma }} + \lambda \underline{1}, \underline{1}'{\underline{\gamma }} - 1). \end{aligned}$$

Setting the derivatives with respect to \({\underline{\gamma }}\) to \(\underline{0}\) leads to:

$$\begin{aligned} ({\underline{\gamma }}, \lambda ) = \bigg ( -\frac{\lambda }{2}A^{-1}\underline{1}, -\frac{2}{\underline{1}'A^{-1}\underline{1}} \bigg ). \end{aligned}$$

Hence, the solution for \({\underline{\gamma }}\) is:

$$\begin{aligned} {\underline{\gamma }}^*= & {} \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}. \end{aligned}$$

Because \({\underline{\gamma }}' A {\underline{\gamma }}\) is a differentiable function of \({\underline{\gamma }}\), its maximum on \(\Delta ^{K-1}\) can occur either on the boundary or at a critical point. The following lemma shows that the critical point \({\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}\) is a local maximum.
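As a quick numerical check of the stationarity conditions (a hypothetical random example, with frequency vectors drawn from a uniform Dirichlet), note that \(A{\underline{\gamma }}^*\) must be a constant vector, since \(2A{\underline{\gamma }}^* = -\lambda \underline{1}\):

```python
import numpy as np

rng = np.random.default_rng(1)
J, K = 6, 3

# Random allele-frequency matrix P (J x K); columns are frequency vectors.
P = rng.dirichlet(np.ones(J), size=K).T
ones = np.ones(K)
A = np.outer(ones, ones) - P.T @ P

A_inv = np.linalg.inv(A)
gamma_star = A_inv @ ones / (ones @ A_inv @ ones)

# Stationarity: 2*A*gamma_star + lambda*1 = 0, so 2*A*gamma_star must be
# a constant vector; the constraint 1'gamma = 1 must also hold.
grad = 2 * A @ gamma_star
grad_spread = np.max(grad) - np.min(grad)
constraint = ones @ gamma_star
```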

Lemma 12

The critical point \({\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}\) is a local maximum of \(H_{\mathrm {adm}}\) seen as a function of \({\underline{\gamma }}\) on \(\Delta ^{K-1}\), under the conditions stated in Theorem 5.

Proof

To show that \({\underline{\gamma }}^*\) is a local maximum, we use the second-derivative test for constrained optimization (e.g. Magnus and Neudecker 2007, p. 155). This test considers the bordered Hessian matrix, representing the matrix of second derivatives of the Lagrange function \(\Lambda \) with respect to \(\lambda \) and the components of \({\underline{\gamma }}\):

$$\begin{aligned} F = \begin{pmatrix} \frac{ \delta ^2 \Lambda }{\delta \lambda ^2} &{} \left( \frac{ \delta ^2 \Lambda }{\delta {\underline{\gamma }} \, \delta \lambda } \right) ' \\ \frac{\delta ^2 \Lambda }{\delta {\underline{\gamma }} \, \delta \lambda } &{} \frac{\delta ^2 \Lambda }{\delta {\underline{\gamma }}^2} \\ \end{pmatrix} = \begin{pmatrix} 0 &{} \left( \frac{ \delta g}{\delta {\underline{\gamma }}} \right) ' \\ \frac{\delta g}{\delta {\underline{\gamma }}} &{} \frac{\delta ^2 \Lambda }{\delta {\underline{\gamma }}^2} \\ \end{pmatrix} = \begin{pmatrix} 0 &{} \underline{1}' \\ \underline{1} &{} 2A \\ \end{pmatrix}. \end{aligned}$$

We must consider the leading principal minors—determinants of square submatrices in the upper-left corner—of F. For \(r=2,3,\ldots ,K\), we denote by \(F_r\) the upper-left \((r+1) \times (r+1)\) submatrix of F, consisting of the border row and column together with the upper-left \(r \times r\) block of 2A. The leading principal minors are the \(\det (F_r)\). Using the definition of A from Eq. 11, we obtain

$$\begin{aligned} F_r = \begin{pmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad \ldots &{}\quad 1\\ 1 &{}\quad 2 H_1 &{}\quad 2C_{12} &{}\quad \ldots &{}\quad 2C_{1r}\\ 1 &{}\quad 2C_{12} &{}\quad 2 H_2 &{}\quad \ldots &{}\quad 2C_{2r} \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ 1 &{}\quad 2C_{1r} &{}\quad 2C_{2r} &{}\quad \ldots &{}\quad 2 H_r \\ \end{pmatrix}. \end{aligned}$$

A sufficient condition for the critical point to be a local maximum is for \((-1)^r \det (F_r) > 0\) for each r (Magnus and Neudecker 2007, p. 155). We now show that this condition is satisfied.

Using the fact that multiplying a row or column of a matrix by a scalar multiplies the determinant by that scalar, we multiply rows 2 through \(r+1\) by \(-1\) and get

$$\begin{aligned} \det (F_r)= & {} \det \begin{pmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad \ldots &{}\quad 1\\ 1 &{}\quad 2 H_1 &{}\quad 2C_{12} &{}\quad \ldots &{}\quad 2C_{1r} \\ 1 &{}\quad 2C_{12} &{}\quad 2 H_2 &{}\quad \ldots &{}\quad 2C_{2r} \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ 1 &{}\quad 2C_{1r} &{}\quad 2C_{2r} &{}\quad \ldots &{}\quad 2 H_r \\ \end{pmatrix}\\= & {} (-1)^r \det \begin{pmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad \ldots &{}\quad 1\\ -1 &{}\quad -2H_1 &{}\quad -2C_{12} &{}\quad \ldots &{}\quad -2C_{1r}\\ -1 &{}\quad -2C_{12} &{}\quad -2H_2 &{}\quad \ldots &{}\quad -2C_{2r}\\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ -1 &{}\quad -2C_{1r} &{}\quad -2C_{2r} &{}\quad \ldots &{}\quad -2 H_r \\ \end{pmatrix}. \end{aligned}$$

Using the fact that adding a multiple of one column to another column does not change the determinant, we add \(-2\) times the first column to each of the remaining columns. We also multiply the first column by \(-1\). We then have

$$\begin{aligned} (-1)^r \det (F_r) = (-1)^{2r+1} \det \begin{pmatrix} 0 &{} \underline{1_r}' \\ \underline{1_r} &{} 2 M_r \\ \end{pmatrix} = - \det \begin{pmatrix} 0 &{} \underline{1_r}' \\ \underline{1_r} &{} 2 M_r \\ \end{pmatrix}, \end{aligned}$$
(22)

where \(M_r\) is the \(r \times r\) matrix consisting of the upper-left corner of matrix \(P'P\), and \(\underline{1_r}\) is the column vector of length r consisting of 1s.

We now apply a result for the determinant of partitioned matrices (Graybill 1976, pp. 19-20). If W is invertible, then

$$\begin{aligned} \det \begin{pmatrix} X &{} Y \\ Z &{} W \end{pmatrix} = \det (W) \det (X-YW^{-1}Z). \end{aligned}$$

Applying this result to Eq. 22, we obtain

$$\begin{aligned} (-1)^r \det (F_r)= & {} -\det (2M_r) \det (-\underline{1_r}'(2M_r)^{-1}\underline{1_r}) \\= & {} - [2^r \det (M_r) ] \bigg [\bigg (-\frac{1}{2}\bigg )\underline{1_r}'M_r^{-1}\underline{1_r}\bigg ]\\= & {} 2^{r-1} \det (M_r) \, (\underline{1_r}'M_r^{-1}\underline{1_r}). \end{aligned}$$

Because \(P'P\) is positive definite, \(M_r\) is also positive definite: because \(\underline{x}'P'P\underline{x} > 0\) for each nonzero column vector \(\underline{x}\), in particular \(\underline{x}'P'P\underline{x} > 0\) for each nonzero \(\underline{x}\) with \(x_k = 0\) for \(k > r\), and for such vectors the quadratic form reduces to that of \(M_r\) applied to the first r coordinates. Because \(M_r\) is positive definite, \(\det (M_r) > 0\) and \(M_r^{-1}\) is also positive definite, leading to \(\underline{1_r}'M_r^{-1}\underline{1_r} > 0\). We conclude

$$\begin{aligned} (-1)^r \det (F_r) > 0, \end{aligned}$$

so that the critical point is the location of a local maximum.\(\square \)
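The sign condition on the leading principal minors can also be verified numerically. The sketch below builds the bordered Hessian for a hypothetical random P and checks \((-1)^r \det (F_r) > 0\) for \(r = 2, \ldots, K\), with \(F_r\) taken as the upper-left \((r+1) \times (r+1)\) block:

```python
import numpy as np

rng = np.random.default_rng(2)
J, K = 6, 4

P = rng.dirichlet(np.ones(J), size=K).T   # random J x K frequency matrix
ones = np.ones(K)
A = np.outer(ones, ones) - P.T @ P

# Bordered Hessian F = [[0, 1'], [1, 2A]], of size (K+1) x (K+1).
F = np.zeros((K + 1, K + 1))
F[0, 1:] = 1
F[1:, 0] = 1
F[1:, 1:] = 2 * A

# Local-maximum condition: (-1)^r det(F_r) > 0 for r = 2, ..., K,
# where F_r is the upper-left (r+1) x (r+1) block of F.
signs_ok = all(
    (-1) ** r * np.linalg.det(F[: r + 1, : r + 1]) > 0
    for r in range(2, K + 1)
)
```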

Proof of Theorem 5 (continued)

Returning to part (iii) of the proof, following Lemma 12, if \({\underline{\gamma }}^* = \frac{A^{-1}\underline{1}}{\underline{1}'A^{-1}\underline{1}}\) is interior to the simplex \(\Delta ^{K-1}\), then \(H_{\mathrm {adm}}\) is maximal at \({\underline{\gamma }} = {\underline{\gamma }}^*\), with maximum \(H_{\mathrm {adm}}({\underline{\gamma }}^*) = \frac{1}{\underline{1}'A^{-1}\underline{1}}\). This value is the reciprocal of the sum of the entries of \(A^{-1}\). If \({\underline{\gamma }}^*\) is not interior to \(\Delta ^{K-1}\), then the maximum lies on the boundary of \(\Delta ^{K-1}\).

Finally, we note that \({\underline{\gamma }}^* = \frac{(P'P)^{-1}\underline{1}}{\underline{1}'(P'P)^{-1}\underline{1}}\) by using Eq. 20.\(\square \)

Proof of Corollary 6

In Theorem 5, the maximum of \(H_{\mathrm {adm}}\) occurs either in the interior of the simplex \(\Delta ^{K-1}\) or on its boundary, the set of points of \(\Delta ^{K-1}\) with at least one coordinate equal to zero.

The boundary of the simplex is the union of K faces, which are themselves \((K-2)\)-simplices. If the maximum lies on the boundary of \(\Delta ^{K-1}\), then without loss of generality, we can permute the labels of the source populations so that \(\gamma _K=0\).

We drop column K from matrix P and apply Theorem 5 with this new \(J \times (K-1)\) matrix, \(P_{\{1,\ldots ,K-1\}}\), which has rank \(K-1\). By assumption, \(\underline{1}' (P_{\{1,\ldots ,K-1\}}' P_{\{1,\ldots ,K-1\}})^{-1} \underline{1} \ne 1\).

We then apply Theorem 5 to \(P_{\{1,\ldots ,K-1\}}\). The maximum of \(H_{\mathrm {adm}}\) occurs either at the point \(\gamma _{{\mathcal {S}}}\), where \({\mathcal {S}}=\{1,2,\ldots , K-1\}\), or on the boundary of the set \(\{{\underline{\gamma }}: \underline{1}'{\underline{\gamma }} = 1 \text{ and } {\underline{\gamma }} \in \Delta ^{K-2} \}\).

We repeat this method of descent, decrementing the dimension (and permuting population labels without loss of generality) until we reach the case of only two source populations. A final application of Theorem 5 then finds that \(H_{\mathrm {adm}}\) is maximized either interior to the 1-simplex—the line connecting vertices (1, 0) and (0, 1)—or at one of these vertices. \(\square \)

Appendix 2: Proofs for \(K=2\): Propositions 7 and 11, Corollaries 8–10

Proof of Proposition 7

We maximize the quadratic polynomial in Eqs. 12–14 over \(\gamma _1 \in [0,1]\). The maximum occurs at the unique critical point or on the boundary of the interval.

Setting the derivative of Eq. 14 with respect to \(\gamma _1\) to 0, we find that the critical point is

$$\begin{aligned} (\gamma _1^*, H_{\mathrm {adm}}) = \bigg ( \frac{C_{12}-H_2}{ 2(C_{12}-H_S) }, \frac{C_{12}^2-H_1H_2}{2(C_{12}-H_S)} \bigg ). \end{aligned}$$
(23)

Because the leading coefficient of Eq. 14 is negative for \(\underline{p_1} \ne \underline{p_2}\), the critical point is a maximum. Hence, if \((C_{12}-H_2 ) / [ 2(C_{12}-H_S) ] \in (0,1)\), then the maximum of \(H_{\mathrm {adm}}\) on the interval [0, 1] lies at \(\gamma _1 = (C_{12}-H_2 ) / [ 2(C_{12}-H_S) ]\). Otherwise, the maximum lies either at \(\gamma _1=0\), in which case it equals \(H_2\), or at \(\gamma _1=1\), in which case it equals \(H_1\).
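The three cases can be checked concretely. The sketch below uses hypothetical random frequency vectors and the \(K=2\) quadratic \(H_{\mathrm {adm}}(\gamma _1) = \gamma _1^2 H_1 + 2\gamma _1(1-\gamma _1)C_{12} + (1-\gamma _1)^2 H_2\), comparing the closed-form maximizer of Eq. 23, clipped to [0, 1] to cover the boundary cases, against a fine grid search:

```python
import numpy as np

rng = np.random.default_rng(3)
J = 5

# Two distinct random allele-frequency vectors.
p1 = rng.dirichlet(np.ones(J))
p2 = rng.dirichlet(np.ones(J))

H1 = 1 - p1 @ p1
H2 = 1 - p2 @ p2
C12 = 1 - p1 @ p2
HS = (H1 + H2) / 2

def H_adm(g1):
    # Heterozygosity of the admixed population as a function of gamma_1.
    return g1**2 * H1 + 2 * g1 * (1 - g1) * C12 + (1 - g1) ** 2 * H2

# Closed-form maximizer (Eq. 23); clipping to [0, 1] handles cases (ii)
# and (iii), where the unconstrained vertex falls outside the interval.
g_star = np.clip((C12 - H2) / (2 * (C12 - HS)), 0, 1)

# Compare against a fine grid search over [0, 1].
grid = np.linspace(0, 1, 100001)
g_grid = grid[np.argmax(H_adm(grid))]
gap = abs(H_adm(g_star) - H_adm(g_grid))
```

Clipping is valid because the parabola opens downward (\(C_{12} > H_S\) for \(\underline{p_1} \ne \underline{p_2}\)), so the constrained maximum is the vertex projected onto the interval.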

The conditions describing the location of the maximum can be written in terms of \(H_1\), \(H_2\), and \(C_{12}\). Because the denominator of \(\gamma ^*_1\) in Eq. 23 is always positive for \(\underline{p_1} \ne \underline{p_2}\) (Sect. 4), \(\gamma _1^* \in (0,1)\) becomes equivalent to \(C_{12} > H_1\) and \(C_{12} > H_2\), the former inequality arising from the condition \(\gamma _1^* < 1 \) and the latter from the condition \(\gamma _1^* > 0\).

If the requirement \(C_{12} > H_1\) and \(C_{12} > H_2\) for \(\gamma _1^* \in (0,1)\) fails, then the maximum occurs on the boundary of the unit interval. We have \(H_{\mathrm {adm}}(0)=H_2\) and \(H_{\mathrm {adm}}(1)=H_1\). Thus, the maximum lies at \(\gamma _1=0\) if \(H_2>H_1\) and at \(\gamma _1=1\) if \(H_1>H_2\).

If \(C_{12} > H_1\) and \(C_{12} > H_2\) do not both hold, then exactly one of them holds, as we showed in Sect. 4 that \(2C_{12} > H_1+H_2\). Combining this fact with the observations that \(H_2>H_1\) leads to a maximum at \(\gamma _1=0\) and that \(H_1>H_2\) leads to a maximum at \(\gamma _1=1\), we complete the characterization of the three cases.

Note that the three cases in the statement of the proposition capture all possible values of \((H_1,H_2,C_{12})\). By the Cauchy–Schwarz inequality, \((1-C_{12})^2 \leqslant (1-H_1)(1-H_2)\), with equality requiring \(\underline{p_1} = \underline{p_2}\). Hence, with \(\underline{p_1} \ne \underline{p_2}\) assumed, either \(1-C_{12} < 1-H_1\) and \(1-C_{12} \geqslant 1-H_2\) (case (ii)), \(1-C_{12} < 1-H_2\) and \(1-C_{12} \geqslant 1-H_1\) (case (iii)), or both \(1-C_{12} < 1-H_1\) and \(1-C_{12} < 1-H_2\) (case (i)).

Alternative expressions in terms of \(H_1\), \(H_2\), and \(F_{12}\) can be derived by noting that \(H_S=\frac{1}{2}(H_1+H_2)\), \(H_1 H_2 = H_S^2 - [(H_1-H_2)/2]^2\) and \(C_{12} = H_S (1+F_{12})/(1-F_{12})\), the latter simply restating Eq. 4 (recalling \(C_{12}=1\) for \(F_{12}=1\)). Thus, we have

$$\begin{aligned} \gamma ^*_1= & {} \frac{C_{12}-H_2}{2(C_{12}-H_S)} = \frac{1}{2}+ \frac{H_1 - H_2}{4\frac{F_{12}}{1-F_{12}}(H_1+H_2)} \end{aligned}$$
(24)
$$\begin{aligned} H_{\mathrm {adm}}(\gamma ^*)= & {} \frac{C_{12}^2 - H_1 H_2}{2(C_{12}-H_S)} = \frac{H_1+H_2}{2(1-F_{12})} + \frac{(H_1-H_2)^2}{ 8\frac{F_{12}}{1-F_{12}}(H_1+H_2)}. \end{aligned}$$
(25)

Another formulation uses the heterozygosity of a population formed by equal admixture of populations 1 and 2, or \(H_T\). Because \(F_{12}=1-H_S/H_T\) by Eq. 1, \(F_{12}/(1-F_{12}) = (H_T-H_S)/H_S\). Using this relationship in Eqs. 24 and 25,

$$\begin{aligned} \gamma ^*_1= & {} \frac{1}{2} + \frac{H_1-H_2}{8(H_T - H_S)} \\ H_{\mathrm {adm}}(\gamma ^*)= & {} H_T + \frac{(H_1-H_2)^2}{16(H_T-H_S)}. \end{aligned}$$

\(\square \)
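The three equivalent formulations of \((\gamma _1^*, H_{\mathrm {adm}}(\gamma ^*))\)—in terms of \(C_{12}\) and \(H_S\), of \(F_{12}\), and of \(H_T\)—can be confirmed to agree numerically; this is a hypothetical random example, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(4)
J = 5
p1 = rng.dirichlet(np.ones(J))
p2 = rng.dirichlet(np.ones(J))

H1, H2 = 1 - p1 @ p1, 1 - p2 @ p2
C12 = 1 - p1 @ p2
HS = (H1 + H2) / 2
HT = (H1 + H2 + 2 * C12) / 4       # heterozygosity under equal admixture
F12 = 1 - HS / HT                  # Eq. 1

# Three expressions for the maximizing gamma_1 (Eqs. 23, 24, H_T form).
g_a = (C12 - H2) / (2 * (C12 - HS))
g_b = 0.5 + (H1 - H2) / (4 * (F12 / (1 - F12)) * (H1 + H2))
g_c = 0.5 + (H1 - H2) / (8 * (HT - HS))

# Three expressions for the maximal heterozygosity (Eqs. 23, 25, H_T form).
h_a = (C12**2 - H1 * H2) / (2 * (C12 - HS))
h_b = (H1 + H2) / (2 * (1 - F12)) + (H1 - H2) ** 2 / (
    8 * (F12 / (1 - F12)) * (H1 + H2))
h_c = HT + (H1 - H2) ** 2 / (16 * (HT - HS))
```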

Proof of Corollary 8

Suppose \(H_1 \geqslant H_2\). If case (i) from Proposition 7 applies, then because \(H_T > H_S\), \(\gamma ^*_1 \geqslant \frac{1}{2}\). Case (ii) cannot apply because \(H_1 < C_{12}\), \(H_2 \geqslant C_{12}\), and \(H_1 \geqslant H_2\) cannot hold simultaneously. In case (iii), \(\gamma ^*_1 = 1 \geqslant \frac{1}{2}\). For the reverse direction, if \(H_1 < H_2\) and case (i) or case (ii) applies, then \(\gamma ^*_1 < \frac{1}{2}\). Case (iii) cannot apply because \(H_1 \geqslant C_{12}\), \(H_2 < C_{12}\), and \(H_1 < H_2\) cannot hold simultaneously.\(\square \)

Proof of Corollary 9

First, we see that \(H_{\mathrm {adm}}(\gamma _1^*) \geqslant H_T\) in case (i) of Proposition 7. In case (ii), \(H_2 > H_T = (H_1+H_2+2C_{12})/4\) because \(H_2 > H_1\) and \(H_2 \geqslant C_{12}\). In case (iii), \(H_1 > H_T\) because \(H_1 > H_2\) and \(H_1 \geqslant C_{12}\). Note that if \(H_1=H_2\), then case (i) applies, producing \(H_{\mathrm {adm}}(\gamma _1^*)=H_T\).\(\square \)

Proof of Corollary 10

We restate the condition \(0< (C_{12}-H_2)/[2(C_{12}-H_S)] < 1\) as

$$\begin{aligned} 0< \frac{1}{2}+ \frac{\left( \frac{H_1 - H_2}{2}\right) }{2\frac{F_{12}}{1-F_{12}}(H_1+H_2)} < 1. \end{aligned}$$

Subtracting \(\frac{1}{2}\) from both sides and multiplying by 2, an equivalent condition is

$$\begin{aligned} -1< \frac{ ( H_1 - H_2 )}{2\frac{F_{12}}{1-F_{12}}(H_1+H_2)} < 1, \end{aligned}$$

or, equivalently, \(| H_1 - H_2 |/{[2\frac{F_{12}}{1-F_{12}}(H_1+H_2)]} < 1\). We rearrange this last expression to obtain the desired result.\(\square \)
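The equivalence in Corollary 10—\(\gamma _1^* \in (0,1)\) exactly when \(|H_1-H_2| < 2\frac{F_{12}}{1-F_{12}}(H_1+H_2)\)—can be spot-checked over many random frequency pairs (hypothetical simulated vectors, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(7)
J = 5
n_mismatch = 0
for _ in range(1000):
    p1 = rng.dirichlet(np.ones(J))
    p2 = rng.dirichlet(np.ones(J))
    H1, H2 = 1 - p1 @ p1, 1 - p2 @ p2
    C12 = 1 - p1 @ p2
    HS = (H1 + H2) / 2
    HT = (H1 + H2 + 2 * C12) / 4
    F12 = 1 - HS / HT                        # Eq. 1
    g_star = (C12 - H2) / (2 * (C12 - HS))   # critical point, Eq. 23
    interior = 0 < g_star < 1
    cond = abs(H1 - H2) < 2 * (F12 / (1 - F12)) * (H1 + H2)
    n_mismatch += (interior != cond)         # the two criteria should agree
```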

Proof of Proposition 11

We apply Proposition 7 with \(J=2\). Substituting \(p_{12}=1-p_{11}\) and \(p_{22}=1-p_{21}\) in Eqs. 15 and 16, we obtain \(C_{12}-H_2 = (p_{11}-p_{21})(1-2p_{21})\), \(C_{12}-H_1 = (p_{21}-p_{11})(1-2p_{11})\), \(C_{12}-H_S = (p_{11}-p_{21})^2\), and \(C_{12}^2-H_1H_2 = (p_{11}-p_{21})^2\). Thus, because \(p_{11}=p_{21}\) is not permitted, the quantities in Eqs. 15 and 16 reduce to those of Eqs. 18 and 19, respectively.

To complete the application of Proposition 7 to \(K=2\), note that case (i) of Proposition 7 occurs when \((p_{11}-p_{21})(1-2p_{21}) > 0\) and \((p_{21}-p_{11})(1-2p_{11}) > 0\). The first of this pair of inequalities requires both \(p_{11} - p_{21} > 0\) and \(1-2p_{21} > 0\), so that \(p_{11} > p_{21}\) and \(\frac{1}{2} > p_{21}\), or both \(p_{11} - p_{21} < 0\) and \(1-2p_{21} < 0\), so that \(p_{11} < p_{21}\) and \(\frac{1}{2} < p_{21}\). The second inequality requires both \(p_{21} - p_{11} > 0\) and \(1-2p_{11} > 0\), so that \(p_{21} > p_{11}\) and \(\frac{1}{2} > p_{11}\), or both \(p_{21} - p_{11} < 0\) and \(1-2p_{11} < 0\), so that \(p_{21} < p_{11}\) and \(\frac{1}{2} < p_{11}\). Thus, the conditions of case (i) of Proposition 7 obtain if and only if \(p_{11}> \frac{1}{2} > p_{21}\) or \(p_{21}> \frac{1}{2} > p_{11}\).

Similarly, using the expressions for \(H_1\), \(H_2\), and \(C_{12}\) when \(K=2\), the conditions of case (ii) of Proposition 7 are equivalent to \(\frac{1}{2} \geqslant p_{21} > p_{11}\) or \(p_{11} > p_{21} \geqslant \frac{1}{2}\). The conditions of case (iii) are equivalent to \(\frac{1}{2} \geqslant p_{11} > p_{21}\) or \(p_{21} > p_{11} \geqslant \frac{1}{2}\).\(\square \)
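The biallelic reductions used in this proof can be verified over a grid of frequency pairs; the grids below are chosen so that \(p_{11} \ne p_{21}\) throughout (a hypothetical check):

```python
import numpy as np

max_err = 0.0
for p11 in np.linspace(0.05, 0.95, 10):     # offset grids avoid p11 == p21
    for p21 in np.linspace(0.1, 0.9, 9):
        p1 = np.array([p11, 1 - p11])
        p2 = np.array([p21, 1 - p21])
        H1, H2 = 1 - p1 @ p1, 1 - p2 @ p2
        C12 = 1 - p1 @ p2
        HS = (H1 + H2) / 2
        d = p11 - p21
        max_err = max(
            max_err,
            abs((C12 - H2) - d * (1 - 2 * p21)),
            abs((C12 - H1) + d * (1 - 2 * p11)),  # C12 - H1 = -d(1 - 2 p11)
            abs((C12 - HS) - d**2),
            abs((C12**2 - H1 * H2) - d**2),
        )
```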

Appendix 3: Dirichlet model for allele frequencies

We first provide results concerning \(H_{\mathrm {adm}}\) in the case that the K source populations have independently and identically distributed (IID) allele frequency vectors. Next, we specify these IID vectors to be Dirichlet distributions.

1.1 IID allele frequency vectors

We begin by examining the expected values of \(H_k\) and \(H_{\mathrm {adm}}\).

Proposition 13

Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\). Then \({\mathbb {E}}[H_{\mathrm {adm}}] = {\mathbb {E}}[H_1] + ( 1 - \sum _{k=1}^K \gamma _k^2 ) ( \sum _{j=1}^J \mathrm {Var}[p_{1j}] )\).

Proof

We use Eq. 8:

$$\begin{aligned} {\mathbb {E}}[H_{\mathrm {adm}}] = 1 - \sum _{k=1}^K \gamma _k^2 \left( \sum _{j=1}^J {\mathbb {E}}[p_{kj}^2] \right) - 2 \sum _{k=1}^{K-1}\sum _{\ell =k+1}^K \gamma _k \gamma _\ell \left( \sum _{j=1}^J {\mathbb {E}}[p_{kj} p_{\ell j}] \right) . \end{aligned}$$

Using the IID assumption and simplifying by noting that \(1 = (\sum _{k=1}^K \gamma _k )^2 = ( \sum _{k=1}^K \gamma _k^2 ) + (2 \sum _{k=1}^{K-1} \sum _{\ell =k+1}^K \gamma _k \gamma _\ell )\), we have

$$\begin{aligned} {\mathbb {E}}[H_{\mathrm {adm}}]= & {} 1 - \left( \sum _{k=1}^K \gamma _k^2 \right) \left( \sum _{j=1}^J {\mathbb {E}}[p_{1j}^2] \right) - 2 \sum _{k=1}^{K-1}\sum _{\ell =k+1}^K \gamma _k \gamma _\ell \left[ \sum _{j=1}^J ({\mathbb {E}}[p_{1j}])^2 \right] \\= & {} 1 - \sum _{j=1}^J {\mathbb {E}}[p_{1j}^2] + \sum _{j=1}^J {\mathbb {E}}[p_{1j}^2] \left( 1 - \sum _{k=1}^K \gamma _k^2 \right) - \sum _{j=1}^J ({\mathbb {E}}[p_{1j}])^2 \left( 1 - \sum _{k=1}^K \gamma _k^2 \right) , \end{aligned}$$

from which the result follows.\(\square \)
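Proposition 13 can be checked by Monte Carlo simulation. The sketch below draws IID symmetric-Dirichlet frequency vectors; the specific K, J, \({\underline{\gamma }}\), and concentration parameter are illustrative choices only:

```python
import numpy as np

rng = np.random.default_rng(5)
J, K = 4, 3
alpha = np.full(J, 0.8)            # symmetric Dirichlet, illustrative value
gamma = np.array([0.5, 0.3, 0.2])  # admixture proportions, sum to 1

n = 200_000
# n draws of K IID Dirichlet frequency vectors, shape (n, K, J).
p = rng.dirichlet(alpha, size=(n, K))

p_adm = np.einsum('k,nkj->nj', gamma, p)   # admixed allele frequencies
H_adm = 1 - (p_adm**2).sum(axis=1)
H1 = 1 - (p[:, 0, :] ** 2).sum(axis=1)

# Proposition 13: E[H_adm] = E[H_1] + (1 - sum gamma_k^2) * sum_j Var[p_1j].
mc = H_adm.mean()
theory = H1.mean() + (1 - (gamma**2).sum()) * p[:, 0, :].var(axis=0).sum()
gap = abs(mc - theory)
```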

An immediate corollary of Proposition 13 is that \(H_{\mathrm {adm}}\) has expectation greater than or equal to the expectation of the heterozygosity of each of the source populations.

Corollary 14

Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\). Then \({\mathbb {E}}[H_{\mathrm {adm}}] \geqslant {\mathbb {E}}[H_k]\).

A second corollary results from the Cauchy–Schwarz inequality, by which \(\sum _{k=1}^K \gamma _k^2 \geqslant \frac{1}{K}\), with equality if and only if \((\gamma _1, \gamma _2, \ldots , \gamma _K) = (\frac{1}{K}, \frac{1}{K}, \ldots , \frac{1}{K})\).

Corollary 15

Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\). Considering all admixture vectors \({\underline{\gamma }} \in \Delta ^{K-1}\), \({\mathbb {E}}[H_{\mathrm {adm}}]\) is maximized at \({\underline{\gamma }} = (\frac{1}{K}, \frac{1}{K}, \ldots , \frac{1}{K})\), and has maximal value \({\mathbb {E}}[H_1] + ( 1 - \frac{1}{K} ) \sum _{j=1}^J \mathrm {Var}[p_{1j}]\).
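The maximization in Corollary 15 can also be observed empirically. In this sketch (parameter values are illustrative choices, not from the original analysis), the empirical mean of \(H_{\mathrm {adm}}\) over a fixed set of Dirichlet draws is largest at the uniform admixture vector:

```python
import numpy as np

rng = np.random.default_rng(3)
K, J, n = 3, 4, 100_000
p = rng.dirichlet(np.ones(J), size=(n, K))   # IID allele-frequency vectors

def mean_H_adm(gamma):
    """Empirical E[H_adm] for a given admixture vector over the sampled frequencies."""
    p_adm = np.einsum('k,nkj->nj', gamma, p)
    return np.mean(1.0 - np.sum(p_adm ** 2, axis=1))

uniform = np.full(K, 1.0 / K)
for g in [np.array([0.6, 0.3, 0.1]), np.array([0.5, 0.25, 0.25]), uniform]:
    print(g, mean_H_adm(g))   # the uniform vector yields the largest value
```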

1.2 IID allele frequency vectors from a symmetric Dirichlet distribution

We now further assume that the independently and identically distributed allele frequency vectors follow a symmetric multivariate Dirichlet distribution. This distribution is frequently used for allele frequency distributions (Balding and Nichols 1995; Pritchard et al. 2000; Huelsenbeck and Andolfatto 2007), and it is a natural probability distribution to assume for allelic types with the same marginal distributions.

The J-dimensional Dirichlet-\((\alpha _1, \alpha _2, \ldots , \alpha _J)\) distribution is defined over the open unit \((J-1)\)-simplex \(\Delta ^{J-1}\) and has concentration parameters \(\alpha _j >0\). The means and variances for the individual allele frequencies are (Lange 1997; Kotz et al. 2000, chapter 49):

$$\begin{aligned} {\mathbb {E}}[p_{kj}]= & {} \frac{\alpha _j}{J{\overline{\alpha }}} \\ \mathrm {Var}[p_{kj}]= & {} \frac{\alpha _j(J{\overline{\alpha }} - \alpha _j)}{J^2{\overline{\alpha }}^2(J{\overline{\alpha }}+1)}, \end{aligned}$$

where \({\overline{\alpha }} = \frac{1}{J}\sum _{j=1}^J \alpha _j\).

The symmetric Dirichlet distribution assumes \(\alpha _1 = \alpha _2 = \ldots = \alpha _J = {\overline{\alpha }}\), leading to:

$$\begin{aligned} {\mathbb {E}}[p_{kj}]= & {} \frac{1}{J} \\ \mathrm {Var}[p_{kj}]= & {} \frac{J-1}{J^2 (J {\overline{\alpha }} + 1)}. \end{aligned}$$

Making these substitutions in Proposition 13, we obtain the expectation of \(H_{\mathrm {adm}}\) under the assumption that the allele frequency vectors follow independent Dirichlet distributions.

Corollary 16

Suppose the allele frequency vectors \(\underline{p_k}\) are independently and identically distributed for \(1 \leqslant k \leqslant K\), all with symmetric multivariate Dirichlet distributions with concentration parameter \({\overline{\alpha }}\). Then

$$\begin{aligned} {\mathbb {E}}[H_k]= & {} \left( 1 - \frac{1}{J} \right) \left( 1 - \frac{1}{J{\overline{\alpha }}+1} \right) , \\ {\mathbb {E}}[H_{\mathrm {adm}}]= & {} \left( 1 - \frac{1}{J} \right) \left( 1- \frac{1}{J{\overline{\alpha }}+1}\sum _{k=1}^K \gamma _k^2 \right) . \end{aligned}$$

This corollary implies that both \({\mathbb {E}}[H_k]\) and \({\mathbb {E}}[H_{\mathrm {adm}}]\) are increasing functions of J and \({\overline{\alpha }}\).
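As a numerical check of the closed forms in Corollary 16 (the values of K, J, \({\overline{\alpha }}\), and the admixture vector below are illustrative choices), a short simulation with symmetric Dirichlet draws:

```python
import numpy as np

rng = np.random.default_rng(1)
K, J, alpha, n = 2, 5, 2.0, 200_000    # alpha is the concentration parameter
gamma = np.array([0.7, 0.3])

# IID symmetric Dirichlet-(alpha, ..., alpha) allele-frequency vectors.
p = rng.dirichlet(np.full(J, alpha), size=(n, K))
Hk = 1.0 - np.sum(p[:, 0, :] ** 2, axis=1)
p_adm = np.einsum('k,nkj->nj', gamma, p)
H_adm = 1.0 - np.sum(p_adm ** 2, axis=1)

# Closed forms from Corollary 16.
EHk = (1 - 1 / J) * (1 - 1 / (J * alpha + 1))
EHadm = (1 - 1 / J) * (1 - np.sum(gamma ** 2) / (J * alpha + 1))
print(Hk.mean(), EHk)        # empirical and exact E[H_k] agree
print(H_adm.mean(), EHadm)   # empirical and exact E[H_adm] agree
```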

The next proposition considers the special case of \(K=2\) and \(J=2\), further specifying a uniform distribution for \(\gamma _1\).

Proposition 17

Consider \(K=2\) and \(J=2\). Suppose that the values of \(p_{11}\) and \(p_{21}\) are independently chosen from a uniform-[0,1] distribution. Suppose further that \(\gamma _1\) is chosen, independently of \(p_{11}\) and \(p_{21}\), from a uniform-[0,1] distribution. Then \({\mathbb {P}}[H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}] = 1 - \log 2 \approx 0.307\).

Proof

Using Proposition 11, we identify the regions of the unit square for \((p_{11},p_{21})\) in which \(\max _{\gamma _1 \in (0,1)} H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\). These regions are \(\{ (p_{11},p_{21}) \,|\, \frac{1}{2}< p_{11}< 1, 0< p_{21} < \frac{1}{2} \}\) and \(\{ (p_{11},p_{21}) \,|\, 0< p_{11}< \frac{1}{2}, \frac{1}{2}< p_{21} < 1 \}\).

Within those regions, we must determine the portion of the unit interval for \(\gamma _1\) in which \(H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\). \(H_{\mathrm {adm}}(\gamma _1)\) is a concave quadratic function of \(\gamma _1\), with \(H_{\mathrm {adm}}(0) = H_2\) and \(H_{\mathrm {adm}}(1) = H_1\). We ignore the set of zero volume with \(H_1=H_2\). In the regions for \((p_{11},p_{21})\) in which \(\max _{\gamma _1 \in (0,1)} H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\) and \(H_2 > H_1\), the interval for \(\gamma _1\) in which \(H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\) is \((0,\frac{1-2p_{21}}{p_{11}-p_{21}})\). In the regions for \((p_{11},p_{21})\) in which \(\max _{\gamma _1 \in (0,1)} H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\) and \(H_1 > H_2\), the interval for \(\gamma _1\) in which \(H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\) is \((\frac{p_{21}-1+p_{11}}{p_{21}-p_{11}}, 1)\).

The desired probability is the volume within the unit cube for \((p_{11}, p_{21}, \gamma _1)\) of the regions in which \(H_{\mathrm {adm}}(\gamma _1) > \max \{H_1,H_2\}\). The volume is

$$\begin{aligned}&\int _{1/2}^1 \int _{1-p_{11}}^{1/2} \int _{0}^{\frac{1-2p_{21}}{p_{11}-p_{21}}} 1 \, \mathrm {d}\gamma _1 \, \mathrm {d}p_{21} \, \mathrm {d}p_{11} + \int _{1/2}^1 \! \int _{0}^{1-p_{11}} \! \int _{\frac{p_{21}-1+p_{11}}{p_{21}-p_{11}}}^1 1 \, \mathrm {d}\gamma _1 \, \mathrm {d}p_{21} \, \mathrm {d}p_{11} \\&\quad + \int _{0}^{1/2} \! \int _{1-p_{11}}^1 \! \int _{\frac{p_{21}-1+p_{11}}{p_{21}-p_{11}}}^1 1 \, \mathrm {d}\gamma _1 \, \mathrm {d}p_{21} \, \mathrm {d}p_{11} + \int _{0}^{1/2} \! \int _{1/2}^{1-p_{11}} \! \int _{0}^{\frac{1-2p_{21}}{p_{11}-p_{21}}} 1 \, \mathrm {d}\gamma _1 \, \mathrm {d}p_{21} \, \mathrm {d}p_{11} \\&\quad = 4 \cdot \frac{1- \log 2}{4} = 1 - \log 2. \end{aligned}$$

\(\square \)
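The probability in Proposition 17 is easy to confirm by simulation. This Monte Carlo sketch (a numerical check, not part of the proof; the sample size is an arbitrary choice) draws \(p_{11}\), \(p_{21}\), and \(\gamma _1\) uniformly and estimates the frequency with which the admixed heterozygosity exceeds both source heterozygosities:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
p11, p21, g = rng.random((3, n))   # independent uniform-[0,1] draws

# With J = 2 alleles, H = 1 - p^2 - (1-p)^2 = 2p(1-p).
H1 = 2 * p11 * (1 - p11)
H2 = 2 * p21 * (1 - p21)
p_adm = g * p11 + (1 - g) * p21    # admixed allele frequency
H_adm = 2 * p_adm * (1 - p_adm)

prob = np.mean(H_adm > np.maximum(H1, H2))
print(prob, 1 - np.log(2))   # the empirical frequency matches 1 - log 2
```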


Cite this article

Boca, S.M., Huang, L. & Rosenberg, N.A. On the heterozygosity of an admixed population. J. Math. Biol. 81, 1217–1250 (2020). https://doi.org/10.1007/s00285-020-01531-9
