The impracticalities of multiplicatively-closed codon models: a retreat to linear alternatives

Journal of Mathematical Biology

Abstract

A matrix Lie algebra is a linear space of matrices closed under the operation \( [A, B] = AB-BA \). The “Lie closure” of a set of matrices is the smallest matrix Lie algebra which contains the set. In the context of Markov chain theory, if a set of rate matrices forms a Lie algebra, the corresponding Markov matrices are closed under matrix multiplication; this has been found to be a useful property in phylogenetics. Inspired by previous research involving Lie closures of DNA models, it was hypothesised that finding the Lie closure of a codon model could help to solve the problem of mis-estimation of the non-synonymous/synonymous rate ratio, \(\omega \). We propose two different methods of obtaining a linear space from a model: the first is the linear closure, which is the smallest linear space that contains the model, and the second is the linear version, which changes multiplicative constraints in the model to additive ones. We then find the Lie closure of each of these linear spaces. Under both methods, it was found that closed codon models would require thousands of parameters, and that any partial solution to this problem that was of a reasonable size violated stochasticity. Investigation of toy models indicated that the Lie closure of a matrix linear space which deviated only slightly from a simple model was close to having the maximum possible number of parameters. Given that Lie closures are not practical, we propose further consideration of the two variants of linearly closed models.
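
To make the definition of the Lie closure concrete, the following is a minimal numerical sketch, not the authors' implementation. The function names `span_basis` and `lie_closure`, the tolerance, and the 2-state example are assumptions introduced here for illustration. The idea is to start from a basis of the linear span of the generators and repeatedly add commutators of basis elements until the dimension of the span stops growing.

```python
import itertools
import numpy as np


def span_basis(matrices, tol=1e-10):
    """Orthonormal basis (as flattened rows) of the linear span of `matrices`."""
    stacked = np.array([np.asarray(m, dtype=float).ravel() for m in matrices])
    _, s, vt = np.linalg.svd(stacked, full_matrices=False)
    # Numerical rank: count singular values above a relative threshold.
    rank = int(np.sum(s > tol * max(1.0, s[0])))
    return vt[:rank]


def lie_closure(generators, tol=1e-10):
    """Basis matrices of the smallest matrix Lie algebra containing `generators`.

    Assumes a non-empty list of square matrices of equal size, not all zero.
    """
    n = np.asarray(generators[0]).shape[0]
    basis = span_basis(generators, tol)
    while True:
        mats = [row.reshape(n, n) for row in basis]
        # Commutators [A, B] = AB - BA of all pairs of current basis elements.
        brackets = [a @ b - b @ a for a, b in itertools.combinations(mats, 2)]
        new_basis = span_basis(mats + brackets, tol)
        if len(new_basis) == len(basis):
            # No commutator leaves the span, so the span is a Lie algebra.
            return mats
        basis = new_basis


# Hypothetical 2-state example: rate matrices with zero column sums.
L1 = np.array([[-1.0, 0.0], [1.0, 0.0]])
L2 = np.array([[0.0, 1.0], [0.0, -1.0]])
print(len(lie_closure([L1, L2])))
```

In this example the two generators already span a Lie algebra, because \( [L_1, L_2] = L_1 - L_2 \), so the closure adds no new dimensions. Since commutators of zero-column-sum matrices again have zero column sums, the dimension of the Lie closure of a set of \(n \times n\) rate matrices is bounded above by \(n(n-1)\); the loop terminates because the dimension can only grow a finite number of times.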

Notes

  1. We note that the LM models given in Fernández-Sánchez et al. (2015) have purine/pyrimidine symmetry, which means that if we permute the nucleotides in a way that leaves the partitioning of nucleotides into purines and pyrimidines unchanged, then we obtain a rate matrix that belongs to the same model.
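
The symmetry described in this note can be checked numerically on a small example. The sketch below is an illustration only: it uses the two-parameter model of Kimura (1980) as an assumed stand-in, not one of the LM models taken from Fernández-Sánchez et al. (2015), and the helper names `k2p`, `in_k2p`, and `preserves_partition` are hypothetical. It permutes the nucleotide labels of a rate matrix and tests whether the result still belongs to the model.

```python
import itertools
import numpy as np

# Nucleotide order A, G, C, T: indices 0, 1 are purines; 2, 3 are pyrimidines.


def k2p(alpha, beta):
    """Kimura two-parameter rate matrix: transitions alpha, transversions beta."""
    Q = np.full((4, 4), beta, dtype=float)
    Q[0, 1] = Q[1, 0] = alpha  # A <-> G
    Q[2, 3] = Q[3, 2] = alpha  # C <-> T
    np.fill_diagonal(Q, 0.0)
    Q -= np.diag(Q.sum(axis=0))  # make every column sum to zero
    return Q


def in_k2p(Q, tol=1e-12):
    """True if Q equals the K2P matrix built from its own A-G and A-C rates."""
    return np.allclose(Q, k2p(Q[0, 1], Q[0, 2]), atol=tol)


def preserves_partition(perm):
    """True if the permutation maps {A, G} and {C, T} onto those same two blocks."""
    return {perm[0], perm[1]} in ({0, 1}, {2, 3})


Q = k2p(2.0, 0.5)
for perm in itertools.permutations(range(4)):
    permuted = Q[np.ix_(perm, perm)]  # relabel rows and columns together
    print(perm, preserves_partition(perm), in_k2p(permuted))
```

Under these assumptions, the eight permutations that keep the purine/pyrimidine partition intact return a matrix in the model, while the remaining sixteen do not when the two rates differ.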

References

  • Barry D, Hartigan JA (1987) Asynchronous distance between homologous DNA sequences. Biometrics 43:261–276

  • Bashford J, Jarvis PD (2000) The genetic code as a periodic table: algebraic aspects. Biosystems 57(3):147–161

  • Bashford J, Tsohantjis I, Jarvis P (1998) A supersymmetric model for the evolution of the genetic code. Proc Natl Acad Sci 95(3):987–992

  • Bennett SN, Holmes EC, Chirivella M, Rodriguez DM, Beltran M, Vorndam V, Gubler DJ, McMillan WO (2006) Molecular evolution of dengue 2 virus in Puerto Rico: positive selection in the viral envelope accompanies clade reintroduction. J Gen Virol 87(4):885–893

  • Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17(6):368–376

  • Fernández-Sánchez J, Sumner JG, Jarvis PD, Woodhams MD (2015) Lie Markov models with purine/pyrimidine symmetry. J Math Biol 70(4):855–891

  • Hasegawa M, Kishino H, Yano T (1985) Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22(2):160–174

  • Hornos JEM, Hornos YM (1993) Algebraic model for the evolution of the genetic code. Phys Rev Lett 71(26):4401

  • Jukes TH, Cantor CR et al (1969) Evolution of protein molecules. Mamm Protein Metab 3(21):132

  • Kaine BT (2011) The effect of closure in phylogenetics. Honours thesis, University of Tasmania

  • Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16(2):111–120

  • Muse SV, Gaut BS (1994) A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol Biol Evol 11(5):715–724

  • Sánchez R, Grau R, Morgado E (2006) A novel Lie algebra of the genetic code over the Galois field of four DNA bases. Math Biosci 202(1):156–174

  • Shen J, Kirk BD, Ma J, Wang Q (2009) Diversifying selective pressure on influenza B virus hemagglutinin. J Med Virol 81(1):114–124

  • Sumner JG (2017) Multiplicatively closed Markov models must form Lie algebras. ANZIAM J 59(2):240–246

  • Sumner JG, Fernández-Sánchez J, Jarvis PD (2012a) Lie Markov models. J Theor Biol 298:16–31

  • Sumner JG, Jarvis PD, Fernández-Sánchez J, Kaine BT, Woodhams MD, Holland BR (2012b) Is the general time-reversible model bad for molecular phylogenetics? Syst Biol 61(6):1069–1074

  • Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10(3):512–526

  • Tavaré S (1986) Some probabilistic and statistical problems in the analysis of DNA sequences. Lect Math Life Sci 17(2):57–86

  • Woodhams MD, Fernández-Sánchez J, Sumner JG (2015) A new hierarchy of phylogenetic models consistent with heterogeneous substitution rates. Syst Biol 64(4):638–650

  • Woodhams MD, Sumner JG, Liberles DA, Charleston MA, Holland BR (2017) Exploring the consequences of lack of closure in codon models. arXiv:1709.05079

  • Yang Z (1997) PAML: a program package for phylogenetic analysis by maximum likelihood. Bioinformatics 13:555–556

  • Yang Z (1998) Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15(5):568–573

Acknowledgements

We thank Andrey Bytsko for pointing out an error in an early draft regarding computation of the linear closures. We also thank the anonymous reviewers for their thorough reading of the manuscript and insightful comments that have led to a substantially improved article. Funding was provided by Australian Research Council (Grant No. DP150100088).

Author information

Corresponding author

Correspondence to Julia A. Shore.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported by Australian Research Council (ARC) Discovery Grant DP150100088 to Barbara R. Holland and Jeremy G. Sumner and Australian Research Training Program scholarship to Julia A. Shore.

About this article

Cite this article

Shore, J.A., Sumner, J.G. & Holland, B.R. The impracticalities of multiplicatively-closed codon models: a retreat to linear alternatives. J. Math. Biol. 81, 549–573 (2020). https://doi.org/10.1007/s00285-020-01519-5

  • DOI: https://doi.org/10.1007/s00285-020-01519-5
