New evaluation methods of read mapping by 17 aligners on simulated and empirical NGS data: an updated comparison of DNA- and RNA-Seq data from Illumina and Ion Torrent technologies
Neural Computing and Applications (IF 4.5), Pub Date: 2021-06-16, DOI: 10.1007/s00521-021-06188-z
Luigi Donato 1, 2 , Concetta Scimone 1, 2 , Carmela Rinaldi 1 , Rosalia D'Angelo 1 , Antonina Sidoti 1

During the last 15 years, improved omics sequencing technologies have expanded the scale and resolution of various biological applications, generating high-throughput datasets that require carefully chosen software tools to be processed. As sequencing has developed, bioinformatics researchers have therefore been challenged to implement alignment algorithms for next-generation sequencing reads. However, the selection of aligners based on genome characteristics remains poorly studied, so our benchmarking study extended the state of the art by comparing 17 different aligners. The chosen tools were assessed on empirical human DNA- and RNA-Seq data, as well as on simulated human and mouse datasets, evaluating a set of parameters not previously considered in benchmarks of this kind. As expected, we found that each tool performed best under specific conditions. For Ion Torrent single-end RNA-Seq samples, the most suitable aligners were CLC and BWA-MEM, which achieved the best results in terms of efficiency, accuracy, duplication rate, saturation profile and running time. For Illumina paired-end osteomyelitis transcriptomics data, by contrast, the best-performing aligner, together with the already mentioned CLC, was Novoalign, which excelled in the accuracy and saturation analyses. Segemehl and DNASTAR performed best on both DNA-Seq datasets, with Segemehl particularly suitable for exome data. In conclusion, our study could guide users in selecting a suitable aligner based on genome and transcriptome characteristics. However, several other aspects that emerged from our work should be considered as the field of alignment research evolves, such as the involvement of artificial intelligence to support cloud computing and mapping to multiple genomes.




Updated: 2021-06-16