Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma.
Briefings in Bioinformatics (IF 6.8), Pub Date: 2020-05-08, DOI: 10.1093/bib/bbaa081
Zheyang Zhang, Sainan Zhang, Xin Li, Zhangxiang Zhao, Changjing Chen, Juxuan Zhang, Mengyue Li, Zixin Wei, Wenbin Jiang, Bo Pan, Ying Li, Yixin Liu, Yingyue Cao, Wenyuan Zhao, Yunyan Gu, Yan Yu, Qingwei Meng, Lishuang Qi

RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, a prognostic 9-gene pair signature (9-GPS) was applied to 197 patients with stage I lung adenocarcinoma, using both the previous and the latest data from The Cancer Genome Atlas (TCGA), which were processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of the discordant risk classifications. Therefore, we constructed a prognostic 40-GPS based on genes that were stable across GENCODE v20-v30 and validated it using public data of 471 stage I samples (log-rank P < 0.0010). Risk classification remained stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20-v30. Specifically, 40-GPS could predict survival for 30 stage I samples from formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures to reference genome and annotation updates. 40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.
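
To make the idea of a discordant risk classification concrete, the following is a minimal sketch of how a gene pair signature might be scored on two versions of the same expression data. The abstract does not state the decision rule, so the rule here is an assumption: each pair votes "high risk" when the first gene is expressed above the second within the same sample, and a patient is called high risk when the votes reach a threshold. The gene names, the threshold `min_votes`, the expression values and the function names are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

GenePair = Tuple[str, str]
Expression = Dict[str, float]  # gene name -> expression estimate for one patient


def classify_risk(expr: Expression, pairs: List[GenePair], min_votes: int) -> str:
    """Assumed rule: a pair (a, b) votes 'high' when expr[a] > expr[b]."""
    votes = sum(1 for a, b in pairs if expr[a] > expr[b])
    return "high" if votes >= min_votes else "low"


def discordance_rate(
    old: Dict[str, Expression],
    new: Dict[str, Expression],
    pairs: List[GenePair],
    min_votes: int,
) -> float:
    """Fraction of shared patients whose risk label differs between versions."""
    patients = old.keys() & new.keys()
    if not patients:
        raise ValueError("no patients shared between the two data versions")
    flipped = sum(
        1
        for p in patients
        if classify_risk(old[p], pairs, min_votes)
        != classify_risk(new[p], pairs, min_votes)
    )
    return flipped / len(patients)


# Hypothetical toy signature and data (not from the paper).
pairs = [("G1", "G2"), ("G3", "G4"), ("G1", "G4")]
old_version = {
    "P1": {"G1": 5.0, "G2": 3.0, "G3": 2.0, "G4": 4.0},
    "P2": {"G1": 1.0, "G2": 2.0, "G3": 1.0, "G4": 3.0},
}
# Same patients after re-processing; only the G4 estimate for P1 has changed.
new_version = {
    "P1": {"G1": 5.0, "G2": 3.0, "G3": 2.0, "G4": 6.0},
    "P2": {"G1": 1.0, "G2": 2.0, "G3": 1.0, "G4": 3.0},
}

rate = discordance_rate(old_version, new_version, pairs, min_votes=2)
print(f"discordant patients: {rate:.1%}")
```

In this toy case a single changed estimate reverses one within-sample comparison and moves patient P1 from high to low risk. That is the kind of instability the abstract attributes to annotation updates, although the actual signatures and thresholds used by the authors may differ from this sketch.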

Updated: 2020-05-08