Sorting Permutations by Intergenic Operations
IEEE/ACM Transactions on Computational Biology and Bioinformatics (IF 3.6), Pub Date: 2021-05-04, DOI: 10.1109/tcbb.2021.3077418
Andre Rodrigues Oliveira, Geraldine Jean, Guillaume Fertin, Klairton Lima Brito, Ulisses Dias, Zanoni Dias

Genome rearrangements are events that affect large stretches of genomes during evolution. Many mathematical models have been used to estimate the evolutionary distance between two genomes based on genome rearrangements. However, most of them focus on the order of the genes in a genome and disregard other important elements in it. Recently, researchers have shown that considering the regions between each pair of consecutive genes, called intergenic regions, can enhance distance estimation on realistic data. Two of the most studied genome rearrangements are the reversal, which inverts a sequence of genes, and the transposition, which occurs when two adjacent gene sequences swap their positions inside the genome. In this work, we study the transposition distance between two genomes while also considering intergenic regions, a problem we name Sorting by Intergenic Transpositions. We show that this problem is NP-hard and propose two approximation algorithms, with factors 3.5 and 2.5, considering two distinct definitions for the problem. We also investigate the signed reversal and transposition distance between two genomes considering their intergenic regions. This second problem is called Sorting by Signed Intergenic Reversals and Intergenic Transpositions. We show that this problem is NP-hard and develop two approximation algorithms, with factors 3 and 2.5. We check how these algorithms behave when weights are assigned to genome rearrangements. Finally, we implemented all these algorithms and tested them on real and simulated data.
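To make the two operations concrete, the following is a minimal Python sketch of how a transposition and a signed reversal might act on a genome that carries intergenic region sizes. The representation is an assumption made for illustration and is not taken from the abstract: genes are a list of (signed) integers, `ir[t]` is the size of the intergenic region immediately before `genes[t]`, and the last entry of `ir` follows the final gene. The function names and the cut offsets `x`, `y`, `z` are hypothetical. The sketch only applies a single operation; it is not one of the approximation algorithms mentioned above.

```python
def intergenic_transposition(genes, ir, i, j, k, x, y, z):
    """Swap the adjacent segments genes[i:j] and genes[j:k].

    Assumed model: len(ir) == len(genes) + 1, and the intergenic regions
    ir[i], ir[j], ir[k] are cut at offsets x, y, z respectively.
    """
    assert len(ir) == len(genes) + 1
    assert 0 <= i < j < k <= len(genes)
    assert 0 <= x <= ir[i] and 0 <= y <= ir[j] and 0 <= z <= ir[k]
    new_genes = genes[:i] + genes[j:k] + genes[i:j] + genes[k:]
    new_ir = (
        ir[:i]
        + [x + (ir[j] - y)]      # left piece joins the head of the second segment
        + ir[j + 1:k]            # regions inside the second segment are unchanged
        + [z + (ir[i] - x)]      # tail of second segment joins head of first
        + ir[i + 1:j]            # regions inside the first segment are unchanged
        + [y + (ir[k] - z)]      # tail of first segment joins the right piece
        + ir[k + 1:]
    )
    return new_genes, new_ir


def intergenic_reversal(genes, ir, i, j, x, y):
    """Invert the segment genes[i:j], flipping the sign of each gene.

    Assumed model: ir[i] and ir[j] are cut at offsets x and y.
    """
    assert len(ir) == len(genes) + 1
    assert 0 <= i < j <= len(genes)
    assert 0 <= x <= ir[i] and 0 <= y <= ir[j]
    new_genes = genes[:i] + [-g for g in reversed(genes[i:j])] + genes[j:]
    new_ir = (
        ir[:i]
        + [x + y]                        # the two outer cut pieces meet
        + ir[i + 1:j][::-1]              # inner regions are reversed with the genes
        + [(ir[i] - x) + (ir[j] - y)]    # the two inner cut pieces meet
        + ir[j + 1:]
    )
    return new_genes, new_ir


genes, ir = [1, 2, 3, 4], [3, 1, 4, 1, 5]
print(intergenic_transposition(genes, ir, 0, 1, 3, 1, 0, 1))
# ([2, 3, 1, 4], [2, 4, 3, 0, 5])
print(intergenic_reversal(genes, ir, 0, 2, 1, 3))
# ([-2, -1, 3, 4], [4, 1, 3, 1, 5])
```

In both cases the gene order changes and nucleotides are redistributed among the cut intergenic regions, while the total intergenic size (14 in this example) is preserved. Under this assumed model, that is what distinguishes the intergenic variants from the classical operations on gene order alone.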

Updated: 2021-05-04