Distance-based phylogenetic inference from typing data: a unifying view.
Briefings in Bioinformatics (IF 9.5), Pub Date: 2020-07-31, DOI: 10.1093/bib/bbaa147
Cátia Vaz, Marta Nascimento, João A Carriço, Tatiana Rocher, Alexandre P Francisco

Typing methods are widely used in the surveillance of infectious diseases, outbreak investigation and studies of the natural history of an infection. Their use is becoming standard, in particular with the introduction of high-throughput sequencing. The data being generated are massive, however, and many algorithms have been proposed for the phylogenetic analysis of typing data, addressing both correctness and scalability issues. Most distance-based algorithms for inferring phylogenetic trees follow the closest pair joining scheme, which is also one of the approaches used in hierarchical clustering. Although phylogenetic inference algorithms may seem rather different, the main differences among them lie in how cluster proximity is defined and in which optimization criterion is used. Both cluster proximity and optimization criteria often rely on a model of evolution. In this work, we review these algorithms and provide a unified view of them. This is an important step not only towards better understanding such algorithms but also towards identifying possible computational bottlenecks and improvements, which matter when dealing with large data sets.
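
To make the closest pair joining scheme concrete, here is a minimal Python sketch in which the cluster proximity is a pluggable function. It is an illustration under stated assumptions, not the authors' formulation: the Hamming distance between allelic profiles, the function names, and the two proximity definitions (single linkage and a UPGMA-style average linkage) are all chosen here for the example.

```python
from itertools import combinations


def hamming(p, q):
    """Number of loci at which two allelic profiles differ (assumed distance)."""
    return sum(a != b for a, b in zip(p, q))


def single_linkage(A, B, d):
    """Cluster proximity = smallest pairwise distance between members."""
    return min(d[i][j] for i in A for j in B)


def average_linkage(A, B, d):
    """Cluster proximity = mean pairwise distance between members (UPGMA-style)."""
    return sum(d[i][j] for i in A for j in B) / (len(A) * len(B))


def closest_pair_joining(profiles, proximity=average_linkage):
    """Repeatedly join the two closest clusters until one remains.

    Returns the list of merges as (cluster_a, cluster_b, proximity) tuples,
    where each cluster is a tuple of indices into `profiles`.
    """
    n = len(profiles)
    # Pairwise distance matrix over the original profiles.
    d = [[hamming(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    clusters = [(i,) for i in range(n)]  # start from singleton clusters
    merges = []
    while len(clusters) > 1:
        # Naive search: re-evaluate the proximity of every pair at each step.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: proximity(pair[0], pair[1], d))
        merges.append((a, b, proximity(a, b, d)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges
```

Swapping `proximity` changes which algorithm the loop behaves like while the joining scheme itself stays the same. The sketch recomputes every pairwise proximity at each step, which is the kind of cost that becomes a bottleneck on large data sets.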

Updated: 2020-07-31