More Accurate Transcript Assembly via Parameter Advising.
Journal of Computational Biology (IF 1.4), Pub Date: 2020-08-04, DOI: 10.1089/cmb.2019.0286
Dan DeBlasio 1, 2 , Kwanho Kim 1, 3 , Carl Kingsford 1

Computational tools for genomic analysis are becoming more accurate, but also more sophisticated and complex. As a result, these tools expose many tunable parameters, and the choice of values often strongly influences the reported results. We quantify the impact of parameter choice on transcript assembly and take first steps toward a truly automated genomic analysis pipeline by developing a method that automatically chooses input-specific parameter values for reference-based transcript assembly with the Scallop tool. Choosing parameter values for each input increases the area under the receiver operating characteristic curve (AUC), measured by comparing assembled transcripts to a reference transcriptome, by an average of 28.9% over the default parameter choices on 1595 RNA-Seq samples from the Sequence Read Archive. The approach is general: applied to StringTie, it increases the AUC by an average of 13.1% on a set of 65 RNA-Seq experiments from ENCODE. Parameter advisors for both Scallop and StringTie are available on GitHub.
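
The core idea of choosing parameter values per input can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the advisor simply tries each setting in a fixed candidate set and keeps the one with the highest score. The names `advise`, `percent_increase`, `toy_score`, and the parameter keys `min_coverage` and `min_length` are hypothetical placeholders, not actual Scallop or StringTie options. In a real pipeline, the scoring callback would run the assembler with the given setting and compute the AUC against a reference transcriptome.

```python
from typing import Callable, Dict, Iterable, Tuple

# A parameter setting maps (hypothetical) parameter names to values.
ParamSetting = Dict[str, float]


def advise(
    candidates: Iterable[ParamSetting],
    run_and_score: Callable[[ParamSetting], float],
) -> Tuple[ParamSetting, float]:
    """Return the candidate setting with the highest score for one input.

    `run_and_score` is assumed to run the assembler on a single sample
    with the given setting and return an accuracy score such as AUC.
    """
    best_setting = None
    best_score = float("-inf")
    for setting in candidates:
        score = run_and_score(setting)
        if score > best_score:
            best_setting, best_score = setting, score
    if best_setting is None:
        raise ValueError("no candidate settings supplied")
    return best_setting, best_score


def percent_increase(default_auc: float, advised_auc: float) -> float:
    """Relative AUC improvement over the default setting, in percent."""
    return 100.0 * (advised_auc - default_auc) / default_auc


def toy_score(setting: ParamSetting) -> float:
    """Stand-in for 'run assembler, compute AUC'; peaks at min_coverage = 2."""
    return 1.0 - 0.1 * abs(setting["min_coverage"] - 2.0)


# Hypothetical default and candidate settings for a single sample.
default = {"min_coverage": 5.0, "min_length": 200.0}
candidates = [
    default,
    {"min_coverage": 2.0, "min_length": 200.0},
    {"min_coverage": 3.0, "min_length": 150.0},
]

best, best_auc = advise(candidates, toy_score)
gain = percent_increase(toy_score(default), best_auc)
```

Including the default setting among the candidates guarantees that the advised choice never scores worse than the default under the same scoring function. Averaging `percent_increase` over all samples is one plausible reading of the reported average improvement; the abstract does not specify the exact aggregation.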

Updated: 2020-08-08