Debar: A sequence-by-sequence denoiser for COI-5P DNA barcode data
Molecular Ecology Resources ( IF 5.5 ) Pub Date : 2021-03-22 , DOI: 10.1111/1755-0998.13384
Cameron M Nugent 1, 2 , Tyler A Elliott 2 , Sujeevan Ratnasingham 2 , Paul D N Hebert 2 , Sarah J Adamowicz 1

DNA barcoding and metabarcoding are now widely used to advance species discovery and biodiversity assessments. High-throughput sequencing (HTS) has expanded the volume and scope of these analyses, but elevated error rates introduce noise into sequence records that can inflate estimates of biodiversity. Denoising (the separation of biological signal from instrument, or technical, noise) of barcode and metabarcode data currently employs abundance-based methods, which do not capitalize on the highly conserved structure of the cytochrome c oxidase subunit I (COI) region employed as the animal barcode. This manuscript introduces debar, an R package that uses a profile hidden Markov model to correct indel errors that instrument error introduces into COI sequences. In silico studies demonstrated that debar recognized 95% of artificially introduced indels in COI sequences. When applied to real-world data, debar reduced indel errors in circular consensus sequences obtained with the Sequel platform by 75%, and in those generated on the Ion Torrent S5 by 94%. The false correction rate was less than 0.1%, indicating that debar is receptive to the majority of true COI variation in the animal kingdom. In conclusion, the debar package improves DNA barcode and metabarcode workflows by aiding the generation of more accurate sequences, which in turn supports the characterization of species diversity.
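
To make the idea of sequence-by-sequence indel correction concrete, the following is a minimal Python sketch, not the debar implementation and not its API. It swaps the profile hidden Markov model for a much simpler stand-in: a global pairwise alignment of each read against a single reference sequence. The function names (`align_to_reference`, `correct_indels`), the scoring values, the short reference string, and the choice to mask deleted positions with `N` are all assumptions made for illustration.

```python
def align_to_reference(read, ref, match=1, mismatch=-1, gap=-2):
    """Globally align a read to a reference (Needleman-Wunsch, linear gaps).

    Returns a list of (op, base) pairs: "M" = aligned base (match or
    substitution), "I" = extra base in the read, "D" = base missing from
    the read. Scoring values are illustrative assumptions.
    """
    n, m = len(read), len(ref)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal path.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                ops.append(("M", read[i - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            ops.append(("I", read[i - 1]))
            i -= 1
        else:
            ops.append(("D", None))
            j -= 1
    ops.reverse()
    return ops


def correct_indels(read, ref):
    """Drop inserted bases and mask deleted positions with 'N'.

    Substitutions are kept untouched, since they may be true variation.
    """
    corrected = []
    for op, base in align_to_reference(read, ref):
        if op == "M":
            corrected.append(base)
        elif op == "D":
            corrected.append("N")
        # op == "I": the extra base is simply skipped
    return "".join(corrected)


REFERENCE = "ATGGCATTCCGA"  # hypothetical toy reference, not real COI

if __name__ == "__main__":
    print(correct_indels("ATGGCAATTCCGA", REFERENCE))  # read with one inserted base
    print(correct_indels("ATGGCTTCCGA", REFERENCE))    # read with one deleted base
    print(correct_indels("ATGGCGTTCCGA", REFERENCE))   # read with one substitution
```

In this sketch every corrected read has the same length as the reference, so the reading frame is restored, while substitutions pass through unchanged. A profile HMM, as described in the abstract, replaces the single reference with position-specific probabilities, which is what lets a model tolerate variation across many taxa.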

Updated: 2021-03-22