A fully-automated method discovers loss of mouse-lethal and human-monogenic disease genes in 58 mammals.
Nucleic Acids Research (IF 14.9), Pub Date: 2020-07-02, DOI: 10.1093/nar/gkaa550
Yatish Turakhia 1 , Heidi I Chen 2 , Amir Marcovitz 2 , Gill Bejerano 2, 3, 4, 5

Gene losses provide an insightful route for studying the morphological and physiological adaptations of species, but their discovery is challenging. Existing genome annotation tools focus on annotating intact genes and do not attempt to distinguish nonfunctional genes from genes missing annotation due to sequencing and assembly artifacts. Previous attempts to annotate gene losses have required significant manual curation, which hampers their scalability for the ever-increasing deluge of newly sequenced genomes. Using extreme sequence erosion (amino acid deletions and substitutions) and sister species support as an unambiguous signature of loss, we developed an automated approach for detecting high-confidence gene loss events across a species tree. Our approach relies solely on gene annotation in a single reference genome, raw assemblies for the remaining species to analyze, and the associated phylogenetic tree for all organisms involved. Using human as reference, we discovered over 400 unique human ortholog erosion events across 58 mammals. This includes dozens of clade-specific losses of genes that result in early mouse lethality or are associated with severe human congenital diseases. Our discoveries yield intriguing potential for translational medical genetics and evolutionary biology, and our approach is readily applicable to large-scale genome sequencing efforts across the tree of life.
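
The loss signature described in the abstract (extreme sequence erosion plus sister species support) can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold value, the function names, the species labels, and the simplified notion of "sister support" (at least one sister species is also eroded) are all assumptions made for illustration only.

```python
from typing import Dict, List, Set

# Hypothetical cutoff: fraction of reference residues that must be deleted or
# substituted before a gene counts as "eroded". Not a value from the paper.
EROSION_THRESHOLD = 0.5


def erosion_fraction(ref_aligned: str, query_aligned: str) -> float:
    """Fraction of reference residues deleted ('-') or substituted in the query.

    Both arguments are rows of one pairwise protein alignment, so they must
    have equal length. Columns where the reference has a gap are skipped.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_positions = 0
    eroded = 0
    for ref_res, query_res in zip(ref_aligned, query_aligned):
        if ref_res == "-":
            continue  # insertion in the query relative to the reference
        ref_positions += 1
        if query_res != ref_res:
            eroded += 1  # deletion or substitution
    return eroded / ref_positions if ref_positions else 0.0


def supported_losses(
    erosion: Dict[str, float],
    sisters: Dict[str, List[str]],
    threshold: float = EROSION_THRESHOLD,
) -> Set[str]:
    """Species whose ortholog is eroded AND has at least one eroded sister.

    `sisters` maps each species to its sister species in the tree; this
    mapping is an assumed, simplified stand-in for a real phylogeny.
    """
    eroded = {species for species, frac in erosion.items() if frac >= threshold}
    return {
        species
        for species in eroded
        if any(sister in eroded for sister in sisters.get(species, []))
    }


if __name__ == "__main__":
    # Toy data with hypothetical species labels.
    erosion = {"sp_a": 0.9, "sp_b": 0.8, "sp_c": 0.7, "sp_d": 0.1}
    sisters = {"sp_a": ["sp_b"], "sp_b": ["sp_a"], "sp_c": ["sp_d"], "sp_d": ["sp_c"]}
    print(sorted(supported_losses(erosion, sisters)))
```

In this toy input, `sp_c` is heavily eroded but its sister `sp_d` is intact, so it is not reported: a lone eroded sequence could equally reflect a sequencing or assembly artifact, which is the ambiguity that sister species support is meant to reduce.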

Updated: 2020-07-02