Impossibility of Consistent Distance Estimation from Sequence Lengths Under the TKF91 Model

  • Original Article
  • Bulletin of Mathematical Biology

Abstract

We consider the problem of distance estimation under the TKF91 model of sequence evolution by insertions, deletions and substitutions on a phylogeny. In an asymptotic regime where the expected sequence lengths tend to infinity, we show that no consistent distance estimation is possible from sequence lengths alone. More formally, we establish that the distributions of pairs of sequence lengths at different distances cannot be distinguished with probability going to one.
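To make the object of study concrete, the following is a minimal simulation sketch of pairs of sequence lengths, not the authors' method. It relies on the standard description of TKF91 lengths as a birth-death chain: with insertion rate `lam` and deletion rate `mu` (where `lam < mu`), a sequence of length `n` gains a site at rate `lam * (n + 1)` and loses one at rate `mu * n`, and the stationary length is geometric with ratio `lam / mu`. The function names, the parameter names, and the choice to model a pair at distance `t` as a stationary length together with its descendant after time `t` are assumptions made for illustration.

```python
import random


def stationary_length(lam, mu, rng):
    """Draw a length from the geometric law P(n) = (1 - lam/mu) * (lam/mu)**n."""
    gamma = lam / mu
    n = 0
    while rng.random() < gamma:
        n += 1
    return n


def evolve_length(n, t, lam, mu, rng):
    """Run the length chain for time t, starting from length n (Gillespie simulation).

    Births occur at rate lam * (n + 1): n mortal links plus the immortal link.
    Deaths occur at rate mu * n: only mortal links can be deleted.
    """
    elapsed = 0.0
    while True:
        birth = lam * (n + 1)
        death = mu * n
        total = birth + death
        elapsed += rng.expovariate(total)
        if elapsed > t:
            return n
        if rng.random() * total < birth:
            n += 1
        else:
            n -= 1


def sample_length_pair(t, lam, mu, rng):
    """Sample (length of one sequence, length of a sequence at distance t)."""
    n0 = stationary_length(lam, mu, rng)
    return n0, evolve_length(n0, t, lam, mu, rng)


def expected_stationary_length(lam, mu):
    """Mean of the geometric stationary law, lam / (mu - lam)."""
    return lam / (mu - lam)
```

Under this parametrization the expected length grows without bound as `lam` approaches `mu` from below, which is one way to realize a regime of long sequences. Comparing empirical samples of `sample_length_pair` at two different values of `t` gives an informal feel for how similar the two distributions of length pairs can be.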

Acknowledgements

SR was supported by NSF grants DMS-1614242, CCF-1740707 (TRIPODS), DMS-1902892, and DMS-1916378, as well as a Simons Fellowship and a Vilas Associates Award. BL was supported by NSF grants DMS-1614242, CCF-1740707 (TRIPODS), and DMS-1902892 (to SR). WTF was supported by NSF grants DMS-1614242 (to SR) and DMS-1855417, and by ONR-TCRI N00014-20-1-2411.

Author information

Corresponding author

Correspondence to Sebastien Roch.

About this article

Cite this article

Fan, WT.L., Legried, B. & Roch, S. Impossibility of Consistent Distance Estimation from Sequence Lengths Under the TKF91 Model. Bull Math Biol 82, 123 (2020). https://doi.org/10.1007/s11538-020-00801-3

  • DOI: https://doi.org/10.1007/s11538-020-00801-3
