Improvement of detection performance of fusion genes from RNA-seq data by clustering short reads
Journal of Bioinformatics and Computational Biology (IF 0.9), Pub Date: 2019-05-10, DOI: 10.1142/s0219720019400080
Yoshiaki Sota 1,2, Shigeto Seno 1, Hironori Shigeta 1, Naoki Osato 1, Masafumi Shimoda 2, Shinzaburo Noguchi 2, Hideo Matsuda 1

Fusion genes are involved in cancer, but their detection from RNA-Seq data is limited by the relatively short read length. We therefore propose a shifted short-read clustering (SSC) method, which groups overlapping reads from the same locus and extends them into a single representative sequence. To verify its usefulness, we applied the SSC method to RNA-Seq data from four cell lines (BT-474, MCF-7, SKBR-3, and T-47D). As the slide width of the SSC method was increased to one, two, five, or ten bases, the read length was extended from 201 bases to 217 (108%), 234 (116%), 282 (140%), or 317 (158%) bases, respectively. Fusion genes were then called with STAR-Fusion, a fusion gene detection tool, with and without the SSC method. With a shift of one base, the proportion of reads mapped to multiple loci decreased from 9.7% to 4.6%, and the sensitivity of fusion gene detection improved from 47% to 54% on average compared with the original data (BT-474: 48% to 57%, MCF-7: 49% to 53%, SKBR-3: 50% to 57%, and T-47D: 43% to 50%). With larger shifts, the positive predictive value also improved. The SSC method could therefore be an effective preprocessing step for fusion gene detection.
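The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of extending a read with overlapping reads that are offset by a few bases. It is not the authors' SSC implementation. The function name `extend_by_shift`, the parameter `max_shift` (assumed here to play a role loosely similar to the slide width), the exact-match overlap rule, and the toy sequence are all hypothetical choices made for illustration.

```python
def extend_by_shift(seed, reads, max_shift):
    """Greedily extend `seed` with reads that overlap its tail when
    shifted by 1..max_shift bases (exact matches only; illustrative)."""
    contig = seed
    read_len = len(seed)
    unused = list(reads)
    extended = True
    while extended:
        extended = False
        # Only the last read_len bases of the contig can overlap a new read.
        tail = contig[-read_len:]
        # Prefer the smallest shift, i.e. the longest overlap.
        for shift in range(1, max_shift + 1):
            for read in unused:
                same_length = len(read) == read_len
                if same_length and read[:read_len - shift] == tail[shift:]:
                    # Append only the bases that stick out past the contig.
                    contig += read[read_len - shift:]
                    unused.remove(read)
                    extended = True
                    break
            if extended:
                break
    return contig


# Toy data (hypothetical): 8-base reads sampled from a 16-base sequence.
reference = "ACGTTGCAAGGCTTAC"
reads = [reference[i:i + 8] for i in (0, 1, 2, 4)]

short_contig = extend_by_shift(reads[0], reads[1:], max_shift=1)
long_contig = extend_by_shift(reads[0], reads[1:], max_shift=2)
```

In this toy case, allowing a larger shift lets the last read (offset by two bases from its neighbor) join the contig, so the representative sequence grows longer. That is the same qualitative trend the abstract reports for increasing slide width, although real data would also need handling of sequencing errors, paired-end reads, and quality scores, none of which this sketch attempts.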

Updated: 2019-05-10