Abstract
A matrix Lie algebra is a linear space of matrices closed under the operation \( [A, B] = AB-BA \). The “Lie closure” of a set of matrices is the smallest matrix Lie algebra that contains the set. In the context of Markov chain theory, if a set of rate matrices forms a Lie algebra, the corresponding Markov matrices are closed under matrix multiplication; this has been found to be a useful property in phylogenetics. Inspired by previous research involving Lie closures of DNA models, it was hypothesised that finding the Lie closure of a codon model could help to solve the problem of mis-estimation of the non-synonymous/synonymous rate ratio, \(\omega \). We propose two different methods of obtaining a linear space from a model: the first is the linear closure, which is the smallest linear space that contains the model, and the second is the linear version, which changes multiplicative constraints in the model to additive ones. We then find the Lie closure of each of these linear spaces. Under both methods, it was found that closed codon models would require thousands of parameters, and that any partial solution to this problem that was of a reasonable size violated stochasticity. Investigation of toy models indicated that taking the Lie closure of a matrix linear space that deviated only slightly from a simple model resulted in a Lie closure with close to the maximum possible number of parameters. Given that Lie closures are not practical, we propose further consideration of the two variants of linearly closed models.
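As an illustration of the definition of a Lie closure given above, the following is a minimal numerical sketch, not the procedure used in the article. It assumes the generators are small dense NumPy arrays, and it repeatedly adds commutators of basis elements until no new linearly independent matrix appears. The function names `commutator`, `add_to_basis` and `lie_closure`, and the tolerance `tol`, are hypothetical choices made for this example.

```python
import numpy as np


def commutator(a, b):
    """Return the Lie bracket [A, B] = AB - BA."""
    return a @ b - b @ a


def add_to_basis(basis, m, tol=1e-9):
    """Append m (flattened, normalised) to basis if it is linearly
    independent of the vectors already there. Returns True if added."""
    norm = np.linalg.norm(m)
    if norm < tol:
        return False  # the zero matrix never enlarges a span
    candidate = (m / norm).ravel()
    stacked = np.array(basis + [candidate])
    if np.linalg.matrix_rank(stacked, tol=tol) > len(basis):
        basis.append(candidate)
        return True
    return False


def lie_closure(generators, tol=1e-9):
    """Return a basis (list of n x n arrays) for the smallest matrix
    Lie algebra containing the given generators."""
    n = generators[0].shape[0]
    basis = []
    for g in generators:
        add_to_basis(basis, np.asarray(g, dtype=float), tol)
    changed = True
    while changed:  # terminates: the dimension is at most n * n
        changed = False
        current = [v.reshape(n, n) for v in basis]
        for i in range(len(current)):
            for j in range(i + 1, len(current)):
                if add_to_basis(basis, commutator(current[i], current[j]), tol):
                    changed = True
    return [v.reshape(n, n) for v in basis]


# Toy example: the two elementary 2-state rate matrices (columns sum to zero).
L1 = np.array([[-1.0, 0.0], [1.0, 0.0]])
L2 = np.array([[0.0, 1.0], [0.0, -1.0]])
print(len(lie_closure([L1, L2])))
```

In this toy case the commutator of the two generators is already a linear combination of them, so the closure adds nothing. For larger models the basis can grow quickly, which is the behaviour the abstract reports for codon models.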
Notes
We note that the LM models given in Fernández-Sánchez et al. (2015) have purine/pyrimidine symmetry: if we permute the nucleotides in a way that leaves the partition of nucleotides into purines and pyrimidines unchanged, then we obtain a rate matrix that belongs to the same model.
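The symmetry described in this note can be illustrated with a short sketch. It rests on assumptions that are not in the note: the partition is treated as an unordered pair of blocks {A, G} and {C, T}, so permutations that swap the two blocks also count, and a two-parameter transition/transversion matrix is used only as a convenient test case. All function names are hypothetical.

```python
from itertools import permutations

import numpy as np

NUCLEOTIDES = "AGCT"
PURINES, PYRIMIDINES = frozenset("AG"), frozenset("CT")
INDEX = {n: i for i, n in enumerate(NUCLEOTIDES)}


def partition_preserving_permutations():
    """List the nucleotide permutations that map the purine/pyrimidine
    partition to itself (blocks may be swapped as a whole)."""
    result = []
    for image in permutations(NUCLEOTIDES):
        perm = dict(zip(NUCLEOTIDES, image))
        purine_image = frozenset(perm[n] for n in PURINES)
        if purine_image in (PURINES, PYRIMIDINES):
            result.append(perm)
    return result


def permute_rate_matrix(q, perm):
    """Relabel the states of rate matrix q according to perm."""
    out = np.empty_like(q)
    for a in NUCLEOTIDES:
        for b in NUCLEOTIDES:
            out[INDEX[perm[a]], INDEX[perm[b]]] = q[INDEX[a], INDEX[b]]
    return out


def two_parameter_matrix(alpha, beta):
    """Illustrative rate matrix: rate alpha within a class (transitions),
    rate beta between classes (transversions)."""
    q = np.zeros((4, 4))
    for a in NUCLEOTIDES:
        for b in NUCLEOTIDES:
            if a != b:
                same_class = (a in PURINES) == (b in PURINES)
                q[INDEX[a], INDEX[b]] = alpha if same_class else beta
    np.fill_diagonal(q, -q.sum(axis=1))  # make each row sum to zero
    return q


symmetries = partition_preserving_permutations()
Q = two_parameter_matrix(2.0, 0.5)
print(len(symmetries))
print(all(np.allclose(permute_rate_matrix(Q, p), Q) for p in symmetries))
```

For this particular test matrix every such permutation returns the same matrix, which is a stronger property than the note requires: in general the permuted matrix only needs to lie in the same model.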
References
Barry D, Hartigan JA (1987) Asynchronous distance between homologous DNA sequences. Biometrics 43:261–276
Bashford J, Jarvis PD (2000) The genetic code as a periodic table: algebraic aspects. Biosystems 57(3):147–161
Bashford J, Tsohantjis I, Jarvis P (1998) A supersymmetric model for the evolution of the genetic code. Proc Natl Acad Sci 95(3):987–992
Bennett SN, Holmes EC, Chirivella M, Rodriguez DM, Beltran M, Vorndam V, Gubler DJ, McMillan WO (2006) Molecular evolution of dengue 2 virus in Puerto Rico: positive selection in the viral envelope accompanies clade reintroduction. J Gen Virol 87(4):885–893
Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17(6):368–376
Fernández-Sánchez J, Sumner JG, Jarvis PD, Woodhams MD (2015) Lie Markov models with purine/pyrimidine symmetry. J Math Biol 70(4):855–891
Hasegawa M, Kishino H, Yano T (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22(2):160–174
Hornos JEM, Hornos YM (1993) Algebraic model for the evolution of the genetic code. Phys Rev Lett 71(26):4401
Jukes TH, Cantor CR et al (1969) Evolution of protein molecules. Mamm Protein Metab 3(21):132
Kaine BT (2011) The effect of closure in phylogenetics. Honours thesis, University of Tasmania
Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16(2):111–120
Muse SV, Gaut BS (1994) A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol Biol Evol 11(5):715–724
Sánchez R, Grau R, Morgado E (2006) A novel Lie algebra of the genetic code over the Galois field of four DNA bases. Math Biosci 202(1):156–174
Shen J, Kirk BD, Ma J, Wang Q (2009) Diversifying selective pressure on influenza B virus hemagglutinin. J Med Virol 81(1):114–124
Sumner JG (2017) Multiplicatively closed Markov models must form Lie algebras. ANZIAM J 59(2):240–246
Sumner JG, Fernández-Sánchez J, Jarvis PD (2012a) Lie Markov models. J Theor Biol 298:16–31
Sumner JG, Jarvis PD, Fernández-Sánchez J, Kaine BT, Woodhams MD, Holland BR (2012b) Is the general time-reversible model bad for molecular phylogenetics? Syst Biol 61(6):1069–1074
Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10(3):512–526
Tavaré S (1986) Some probabilistic and statistical problems in the analysis of DNA sequences. Lect Math Life Sci 17(2):57–86
Woodhams MD, Fernández-Sánchez J, Sumner JG (2015) A new hierarchy of phylogenetic models consistent with heterogeneous substitution rates. Syst Biol 64(4):638–650
Woodhams MD, Sumner JG, Liberles DA, Charleston MA, Holland BR (2017) Exploring the consequences of lack of closure in codon models. arXiv:1709.05079
Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Bioinformatics 13:555–556
Yang Z (1998) Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15(5):568–573
Acknowledgements
We thank Andrey Bytsko for pointing out an error in an early draft regarding computation of the linear closures. We also thank the anonymous reviewers for their thorough reading of the manuscript and insightful comments, which have led to a substantially improved article. Funding was provided by the Australian Research Council (Grant No. DP150100088).
This research was supported by Australian Research Council (ARC) Discovery Grant DP150100088 to Barbara R. Holland and Jeremy G. Sumner, and by an Australian Research Training Program scholarship to Julia A. Shore.
Cite this article
Shore, J.A., Sumner, J.G. & Holland, B.R. The impracticalities of multiplicatively-closed codon models: a retreat to linear alternatives. J. Math. Biol. 81, 549–573 (2020). https://doi.org/10.1007/s00285-020-01519-5