Comparative evaluation of SNVs, indels, and structural variations detected with short- and long-read sequencing data
Human Genome Variation. Pub Date: 2024-04-17. DOI: 10.1038/s41439-024-00276-x
Shunichi Kosugi, Chikashi Terao

Short- and long-read sequencing technologies are routinely used to detect DNA variants, including SNVs, indels, and structural variations (SVs). However, the differences in the quality and quantity of variants detected between short- and long-read data are not fully understood. In this study, we comprehensively evaluated the variant calling performance of short- and long-read-based SNV, indel, and SV detection algorithms (6 for SNVs, 12 for indels, and 13 for SVs) using a novel evaluation framework incorporating manual visual inspection. The results showed that indel-insertion calls greater than 10 bp were poorly detected by short-read-based detection algorithms compared to long-read-based algorithms; however, the recall and precision of SNV and indel-deletion detection were similar between short- and long-read data. In repetitive regions, the recall of SV detection was significantly lower with short-read-based algorithms than with long-read-based algorithms, especially for small- to intermediate-sized SVs. In contrast, the recall and precision of SV detection in nonrepetitive regions were similar between short- and long-read data. These findings suggest the need for refined strategies, such as incorporating multiple variant detection algorithms, to generate a more complete set of variants using short-read data.
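
As a rough illustration of the recall and precision metrics mentioned in the abstract, the following minimal Python sketch compares call sets against a truth set and shows how taking the union of two callers can raise recall. It is not the authors' evaluation framework: the `Variant` type, the `precision_recall` helper, the exact-match rule, and all example variants are hypothetical. Real indel and SV benchmarking typically needs tolerant matching of positions and sizes, which is omitted here.

```python
from typing import Iterable, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (illustration only)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def precision_recall(calls: Iterable[Variant],
                     truth: Iterable[Variant]) -> Tuple[float, float]:
    """Return (precision, recall) using exact matching of variant records.

    Assumption: a call is a true positive only if it is identical to a
    truth-set record; no breakpoint or size tolerance is applied.
    """
    call_set, truth_set = set(calls), set(truth)
    true_positives = len(call_set & truth_set)
    precision = true_positives / len(call_set) if call_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Made-up truth set: one SNV, one insertion longer than 10 bp, one deletion.
truth = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 200, "C", "CTTAGGCATTAGC"),
    Variant("chr2", 50, "GAT", "G"),
}

# Made-up caller A: misses the long insertion and reports one false positive.
caller_a = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 50, "GAT", "G"),
    Variant("chr3", 10, "T", "C"),
}

# Made-up caller B: finds the long insertion but misses the deletion.
caller_b = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 200, "C", "CTTAGGCATTAGC"),
}

# Combining callers by union: recall rises, while caller A's false
# positive is carried over into the merged set.
merged = caller_a | caller_b

for name, calls in [("A", caller_a), ("B", caller_b), ("A+B", merged)]:
    p, r = precision_recall(calls, truth)
    print(f"caller {name}: precision={p:.2f} recall={r:.2f}")
```

In this toy data the union recovers every truth variant but also inherits the false positive from caller A, which is the usual trade-off when merging call sets.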




Updated: 2024-04-17