More Accurate Transcript Assembly via Parameter Advising.
Journal of Computational Biology ( IF 1.7 ) Pub Date : 2020-08-04 , DOI: 10.1089/cmb.2019.0286
Dan DeBlasio 1, 2 , Kwanho Kim 1, 3 , Carl Kingsford 1

Computational tools used for genomic analyses are becoming more accurate but also increasingly sophisticated and complex. As a result, these tools expose a large number of tunable parameters that often have a large influence on the reported results. We quantify the impact of parameter choice on transcript assembly and take some first steps toward a truly automated genomic analysis pipeline by developing a method that automatically chooses input-specific parameter values for reference-based transcript assembly using the Scallop tool. By choosing parameter values for each input, the area under the receiver operating characteristic curve (AUC) when comparing assembled transcripts to a reference transcriptome is increased by an average of 28.9% over using only the default parameter choices on 1595 RNA-Seq samples in the Sequence Read Archive. This approach is general, and when applied to StringTie, it increases the AUC by an average of 13.1% on a set of 65 RNA-Seq experiments from ENCODE. Parameter advisors for both Scallop and StringTie are available on GitHub.
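
The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what choosing input-specific parameter values could look like: score each candidate parameter setting on one input and keep the best. The names `advise`, `percent_increase`, `param_a`, and the toy scoring function are hypothetical and are not taken from the Scallop or StringTie advisors; in practice the score would be the AUC of an assembly produced with those parameters. The sketch also assumes the reported improvement is a relative percentage over the default AUC, which the abstract does not state explicitly.

```python
from typing import Callable, Dict, Iterable, Tuple

# A parameter setting is a mapping from (hypothetical) parameter names to values.
Params = Dict[str, float]


def advise(candidates: Iterable[Params],
           score: Callable[[Params], float]) -> Tuple[Params, float]:
    """Return the candidate setting with the highest score, and that score.

    `score` stands in for "run the assembler with these parameters on this
    input and measure AUC against the reference transcriptome".
    """
    best_params, best_score = None, float("-inf")
    for params in candidates:
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    if best_params is None:
        raise ValueError("candidates must not be empty")
    return best_params, best_score


def percent_increase(default_auc: float, advised_auc: float) -> float:
    """Relative improvement of the advised AUC over the default AUC, in percent."""
    return 100.0 * (advised_auc - default_auc) / default_auc


def toy_score(params: Params) -> float:
    """Toy stand-in for AUC: peaks when the hypothetical param_a equals 2.0."""
    return 1.0 - abs(params["param_a"] - 2.0) / 10.0


if __name__ == "__main__":
    candidate_settings = [{"param_a": 1.0}, {"param_a": 2.0}, {"param_a": 5.0}]
    chosen, chosen_score = advise(candidate_settings, toy_score)
    default_score = toy_score({"param_a": 5.0})  # pretend 5.0 is the default
    print(chosen, chosen_score, percent_increase(default_score, chosen_score))
```

Because the choice is made per input, the same candidate list can yield a different winner for each RNA-Seq sample; only the scoring step depends on the sample.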

Updated: 2020-08-08