AIDE: annotation-assisted isoform discovery with high precision.
Genome Research (IF 7), Pub Date: 2019-11-06, DOI: 10.1101/gr.251108.119
Wei Vivian Li 1,2, Shan Li 3, Xin Tong 4, Ling Deng 5, Hubing Shi 3, Jingyi Jessica Li 2,6

Genome-wide accurate identification and quantification of full-length mRNA isoforms is crucial for investigating transcriptional and posttranscriptional regulatory mechanisms of biological phenomena. Despite continuing efforts in developing effective computational tools to identify or assemble full-length mRNA isoforms from second-generation RNA-seq data, it remains a challenge to accurately identify mRNA isoforms from short sequence reads owing to the substantial information loss in RNA-seq experiments. Here, we introduce a novel statistical method, annotation-assisted isoform discovery (AIDE), the first approach that directly controls false isoform discoveries by implementing the testing-based model selection principle. Solving the isoform discovery problem in a stepwise and conservative manner, AIDE prioritizes the annotated isoforms and precisely identifies novel isoforms whose addition significantly improves the explanation of observed RNA-seq reads. We evaluate the performance of AIDE based on multiple simulated and real RNA-seq data sets followed by PCR-Sanger sequencing validation. Our results show that AIDE effectively leverages the annotation information to compensate for the information loss owing to short read lengths. AIDE achieves the highest precision in isoform discovery and the lowest error rates in isoform abundance estimation, compared with three state-of-the-art methods: Cufflinks, SLIDE, and StringTie. As a robust bioinformatics tool for transcriptome analysis, AIDE enables researchers to discover novel transcripts with high confidence.
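
The abstract describes the idea only at a high level: start from the annotated isoforms and add a novel isoform only when it significantly improves the explanation of the observed reads. The toy sketch below illustrates that general pattern (forward stepwise selection gated by a likelihood ratio test). It is not AIDE's actual implementation. The read representation (a set of compatible isoform names per read), the likelihood floor `FLOOR`, the function names, and the chi-square approximation with one degree of freedom are all assumptions made for illustration.

```python
import math

FLOOR = 1e-9  # assumed likelihood floor for reads that no isoform in the model explains


def fit_mixture(reads, isoforms, n_iter=200):
    """EM estimate of isoform proportions; returns (proportions, log-likelihood).

    reads: list of sets, each holding the names of isoforms compatible with a read.
    Isoform lengths and fragment positions are ignored in this toy model.
    """
    rows = [[1.0 if iso in r else 0.0 for iso in isoforms] for r in reads]
    theta = [1.0 / len(isoforms)] * len(isoforms)
    for _ in range(n_iter):
        counts = [0.0] * len(isoforms)
        for row in rows:
            weights = [t * p for t, p in zip(theta, row)]
            total = sum(weights)
            if total > 0:  # reads explained by no isoform contribute nothing
                for j, w in enumerate(weights):
                    counts[j] += w / total
        norm = sum(counts)
        if norm == 0:
            break
        theta = [c / norm for c in counts]
    loglik = sum(
        math.log(max(sum(t * p for t, p in zip(theta, row)), FLOOR)) for row in rows
    )
    return theta, loglik


def stepwise_select(reads, annotated, candidates, alpha=0.01):
    """Keep all annotated isoforms; add candidates one at a time while significant."""
    selected = list(annotated)
    remaining = list(candidates)
    _, loglik = fit_mixture(reads, selected)
    while remaining:
        # Score every remaining candidate by its likelihood ratio statistic.
        scored = []
        for iso in remaining:
            _, ll_new = fit_mixture(reads, selected + [iso])
            scored.append((2.0 * (ll_new - loglik), iso, ll_new))
        stat, iso, ll_new = max(scored, key=lambda s: s[0])
        # Chi-square (1 df) tail probability; a rough approximation, since the
        # added proportion sits on the boundary of the parameter space.
        p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
        if p_value >= alpha:
            break  # conservative: stop when no candidate is significant
        selected.append(iso)
        remaining.remove(iso)
        loglik = ll_new
    return selected


# Hypothetical data: N1 explains reads the annotation cannot; N2 adds nothing new.
reads = [{"A"}] * 50 + [{"B"}] * 30 + [{"N1"}] * 20 + [{"A", "N2"}] * 10
print(stepwise_select(reads, annotated=["A", "B"], candidates=["N1", "N2"]))
```

In this made-up example, the candidate `N1` is accepted because 20 reads are compatible with no annotated isoform, while `N2` is rejected because every read it explains is already explained by `A`.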

Updated: 2019-11-01