Long Reads Are Revolutionizing 20 Years of Insect Genome Sequencing
Genome Biology and Evolution (IF 3.2), Pub Date: 2021-06-11, DOI: 10.1093/gbe/evab138
Scott Hotaling 1, John S Sproul 2, Jacqueline Heckenhauer 3,4, Ashlyn Powell 5, Amanda M Larracuente 2, Steffen U Pauls 3,4,6, Joanna L Kelley 1, Paul B Frandsen 3,5,7

The first insect genome assembly (Drosophila melanogaster) was published two decades ago. Today, nuclear genome assemblies are available for a staggering 601 insect species representing 20 orders. In this study, we analyzed the most-contiguous assembly for each species and provide a “state-of-the-field” perspective, emphasizing taxonomic representation, assembly quality, gene completeness, and sequencing technologies. Relative to species richness, genomic efforts have been biased toward four orders (Diptera, Hymenoptera, Collembola, and Phasmatodea), Coleoptera are underrepresented, and 11 orders still lack a publicly available genome assembly. The average insect genome assembly is 439.2 Mb in length with 87.5% of single-copy benchmarking genes intact. Most notable has been the impact of long-read sequencing; assemblies that incorporate long reads are ∼48× more contiguous than those that do not. We offer four recommendations as we collectively continue building insect genome resources: 1) seek better integration between independent research groups and consortia, 2) balance future sampling between filling taxonomic gaps and generating data for targeted questions, 3) take advantage of long-read sequencing technologies, and 4) expand and improve gene annotations.

Last updated: 2021-06-11