Alignment and mapping methodology influence transcript abundance estimation
Genome Biology (IF 12.3) Pub Date: 2020-09-07, DOI: 10.1186/s13059-020-02151-8
Avi Srivastava 1, Laraib Malik 1, Hirak Sarkar 2, Mohsen Zakeri 2, Fatemeh Almodaresi 2, Charlotte Soneson 3, 4, Michael I Love 5, 6, Carl Kingsford 7, Rob Patro 2

Background: The accuracy of transcript quantification using RNA-seq data depends on many factors, such as the choice of alignment or mapping method and the quantification model being adopted. While the choice of quantification model has been shown to be important, considerably less attention has been given to comparing the effect of various read alignment approaches on quantification accuracy.

Results: We investigate the influence of mapping and alignment on the accuracy of transcript quantification in both simulated and experimental data, as well as the effect on subsequent differential expression analysis. We observe that, even when the quantification model itself is held fixed, the effect of choosing a different alignment methodology, or aligning reads using different parameters, on quantification estimates can sometimes be large and can affect downstream differential expression analyses as well. These effects can go unnoticed when assessment is focused too heavily on simulated data, where the alignment task is often simpler than in experimentally acquired samples. We also introduce a new alignment methodology, called selective alignment, to overcome the shortcomings of lightweight approaches without incurring the computational cost of traditional alignment.

Conclusion: We observe that, on experimental datasets, the performance of lightweight mapping and alignment-based approaches varies significantly, and highlight some of the underlying factors. We show this variation both in terms of quantification and downstream differential expression analysis. In all comparisons, we also show the improved performance of our proposed selective alignment method and suggest best practices for performing RNA-seq quantification.

Updated: 2020-09-07