A generalized Robinson-Foulds distance for labeled trees
BMC Genomics (IF 3.5), Pub Date: 2020-11-18, DOI: 10.1186/s12864-020-07011-0
Samuel Briand, Christophe Dessimoz, Nadia El-Mabrouk, Manuel Lafond, Gabriela Lobinska

The Robinson-Foulds (RF) distance is a well-established measure of dissimilarity between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and of being computable in linear time. For phylogenetic applications involving genes, however, the RF metric ignores a crucial aspect of the trees: the type of each branching event (e.g. speciation, duplication, transfer, etc.). We extend RF to trees with labeled internal nodes by including a node flip operation alongside edge contractions and extensions. We explore the properties of this extended RF distance in the case of a binary labeling. In particular, we show that, contrary to the unlabeled case, an optimal edit path may require contracting “good” edges, i.e. edges shared between the two trees. We provide a 2-approximation algorithm, which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions. Implementation and simulations are available at https://github.com/DessimozLab/pylabeledrf.
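
For orientation, the classic unlabeled RF distance that the abstract starts from can be sketched as follows. This is only an illustrative sketch of the rooted, clade-based variant, not the labeled extension or the 2-approximation algorithm from the paper, and it does not use the pylabeledrf API. The nested-tuple tree encoding and the names `clades` and `rf_distance` are assumptions made for this example, and the simple set-based approach shown here is not the linear-time algorithm.

```python
def clades(tree):
    """Return (leaf set, set of non-trivial clades) of a rooted tree.

    Assumed encoding (hypothetical, for illustration only): a leaf is a
    string, an internal node is a tuple of child subtrees.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):  # leaf: contributes only its own label
            return frozenset([node])
        # Internal node: its clade is the union of its children's leaf sets.
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = visit(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return all_leaves, found


def rf_distance(t1, t2):
    """Count the clades present in exactly one of the two trees."""
    leaves1, c1 = clades(t1)
    leaves2, c2 = clades(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must be on the same leaf set")
    return len(c1 ^ c2)


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(rf_distance(t1, t2))  # 2: contract the {a, b} edge, extend a {a, c} edge
```

In the edit-operation view used in the abstract, each clade found in only one tree corresponds to one edge contraction or one edge extension. The labeled distance adds node flips on top of these two operations.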

Updated: 2020-11-19