Comparative genomics and community curation further improve gene annotations in the nematode Pristionchus pacificus
bioRxiv - Genomics Pub Date : 2020-08-04 , DOI: 10.1101/2020.08.03.233726
Marina Athanasouli , Hanh Witte , Christian Weiler , Tobias Loschko , Gabi Eberhardt , Ralf J. Sommer , Christian Rödelsperger

Background: Nematode model organisms such as Caenorhabditis elegans and Pristionchus pacificus are powerful systems for studying the evolution of gene function at a mechanistic level. However, the identification of P. pacificus orthologs of candidate genes known from C. elegans is complicated by the discrepancy in the quality of gene annotations, a common problem in nematode and invertebrate genomics. Results: Here, we combine comparative genomic screens for suspicious gene models with community-based curation to further improve the quality of gene annotations in P. pacificus. We extend previous curations of one-to-one orthologs to larger gene families and also orphan genes. Cross-species comparisons of protein lengths and screens for atypical domain combinations and species-specific orphan genes resulted in 4,221 candidate genes that were subjected to community-based curation. Corrections for 2,851 gene models were implemented in a new version of the P. pacificus gene annotations. The new set of gene annotations contains 28,896 genes and has a single-copy ortholog completeness level of 97.6%. Conclusions: Our work demonstrates the effectiveness of comparative genomic screens to identify suspicious gene models and the scalability of community-based approaches to improve the quality of thousands of gene models. Similar community-based approaches can help to improve the quality of gene annotations in other invertebrate species, including parasitic nematodes.

Updated: 2020-08-04