Assessing graph-based read mappers against a baseline approach highlights strengths and weaknesses of current methods
BMC Genomics (IF 4.4), Pub Date: 2020-04-06, DOI: 10.1186/s12864-020-6685-y
Ivar Grytten, Knut D. Rand, Alexander J. Nederbragt, Geir K. Sandve

Graph-based reference genomes have become popular as they allow read mapping and follow-up analyses in settings where the exact haplotypes underlying a high-throughput sequencing experiment are not precisely known. Two recent papers show that mapping to graph-based reference genomes can improve accuracy compared to methods using linear references. Both of these methods index the sequences for most paths up to a certain length in the graph in order to enable direct mapping of reads containing common variants. However, the combinatorial explosion of possible paths through nearby variants also leads to a huge search space and an increased chance of false positive alignments to highly variable regions. Here we assess three prominent graph-based read mappers against a hybrid baseline approach that combines an initial path determination with a tuned linear read mapping method. We show, using a previously proposed benchmark, that this simple approach is able to improve the overall accuracy of read mapping to graph-based reference genomes. Our method is implemented in a tool, Two-step Graph Mapper, which is available at https://github.com/uio-bmi/two_step_graph_mapper along with data and scripts for reproducing the experiments. Our method highlights characteristics of the current generation of graph-based read mappers and shows potential for improvement for future graph-based read mappers.
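
The combinatorial explosion mentioned in the abstract can be made concrete with a toy example. The sketch below is only an illustration under simplifying assumptions: it is not the indexing scheme of any mapper assessed in the paper, and the names `count_paths` and `enumerate_paths` are hypothetical. It models a short region as shared segments separated by variant sites, and shows that the number of distinct paths is the product of the allele counts at each site, so ten nearby biallelic variants already give 1024 candidate sequences.

```python
from itertools import product


def count_paths(variants):
    """Number of distinct paths through a run of variant sites.

    `variants` is a list of tuples, one tuple of alleles per site
    (hypothetical toy representation, not a real graph format).
    """
    total = 1
    for alleles in variants:
        total *= len(alleles)
    return total


def enumerate_paths(segments, variants):
    """Spell out every path sequence through the toy region.

    `segments` holds the shared sequence before, between, and after
    the variant sites, so it must have one more entry than `variants`.
    """
    if len(segments) != len(variants) + 1:
        raise ValueError("need exactly one more segment than variant sites")
    paths = []
    # product() picks one allele per site, covering every combination.
    for choice in product(*variants):
        parts = [segments[0]]
        for allele, segment in zip(choice, segments[1:]):
            parts.append(allele)
            parts.append(segment)
        paths.append("".join(parts))
    return paths


# Two biallelic SNPs in a short region give 2 * 2 = 4 paths.
toy_segments = ["AC", "GT", "TT"]
toy_variants = [("A", "G"), ("C", "T")]
toy_paths = enumerate_paths(toy_segments, toy_variants)

# Ten nearby biallelic variants give 2 ** 10 = 1024 paths.
ten_sites = [("A", "G")] * 10
n_ten = count_paths(ten_sites)
```

Each added site multiplies the number of sequences an index has to cover, which is the motivation the abstract gives for first settling on a single path and then mapping against it linearly.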

Updated: 2020-04-22