A Neural Network Framework for Predicting the Tissue-of-Origin of 15 Common Cancer Types Based on RNA-Seq Data
Frontiers in Bioengineering and Biotechnology (IF 4.3), Pub Date: 2020-08-05, DOI: 10.3389/fbioe.2020.00737
Binsheng He 1 , Yanxiang Zhang 2 , Zhen Zhou 3 , Bo Wang 2 , Yuebin Liang 2 , Jidong Lang 2 , Huixin Lin 2 , Pingping Bing 1 , Lan Yu 4 , Dejun Sun 4 , Huaiqing Luo 1 , Jialiang Yang 1, 2 , Geng Tian 2

Sequencing-based identification of tumor tissue-of-origin (TOO) is critical for patients with cancer of unknown primary. Even when the TOO of a tumor can be diagnosed by clinicopathological observation, re-evaluation by computational methods can help avoid misdiagnosis. In this study, we developed a neural network (NN) framework that uses the expression of a 150-gene panel to infer the TOO for 15 common solid tumor types: lung, breast, liver, colorectal, gastroesophageal, ovarian, cervical, endometrial, pancreatic, bladder, head and neck, thyroid, prostate, kidney, and brain cancers. First, we downloaded RNA-Seq data for 7,460 primary tumor samples across these 15 cancer types, with between 142 and 1,052 samples per type, from The Cancer Genome Atlas (TCGA). We then performed feature selection by the Pearson correlation method and analyzed the 150-gene panel; its genes were significantly enriched in GO:2001242 (regulation of intrinsic apoptotic signaling pathway), GO:0009755 (hormone-mediated signaling pathway), and other similar functions. Next, we developed a novel NN model that uses the 150 genes to predict the TOO for the 15 cancer types. Based on 10-fold cross-validation over the 7,460 tumor samples, the average prediction sensitivity and precision of the framework are 93.36% and 94.07%, respectively; for a few specific cancers, such as prostate cancer, both reached 100%. We also tested the trained model on an independent dataset of 20 metastatic tumor samples and achieved 80% accuracy. In summary, we present a highly accurate method for inferring tumor TOO that has potential for clinical implementation.
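To make two of the steps named in the abstract concrete, the sketch below shows one way Pearson-correlation feature ranking and per-class sensitivity/precision could be computed. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say how the correlation was applied or how the averages were taken, so the one-vs-rest ranking scheme, the macro-averaging, the function names, the gene names (`GENE_A`, etc.), and all toy values here are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_genes(expression, labels, target, top_k):
    """Return the top_k genes by |r| against a one-vs-rest indicator.

    ASSUMPTION: correlating each gene with a 0/1 indicator for one
    cancer type is an illustrative choice, not the paper's stated method.
    """
    indicator = [1.0 if lab == target else 0.0 for lab in labels]
    scores = {g: abs(pearson(v, indicator)) for g, v in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def macro_sensitivity_precision(y_true, y_pred):
    """Macro-averaged sensitivity (recall) and precision over true classes.

    ASSUMPTION: unweighted averaging across classes.
    """
    classes = sorted(set(y_true))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    actual = Counter(y_true)
    predicted = Counter(y_pred)
    sens = [true_pos[c] / actual[c] for c in classes]
    prec = [true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes]
    return sum(sens) / len(classes), sum(prec) / len(classes)


# Toy data (hypothetical): six samples, three tissue types, three genes.
labels = ["lung", "lung", "breast", "breast", "liver", "liver"]
expression = {
    "GENE_A": [9.0, 8.5, 1.0, 1.2, 0.9, 1.1],  # elevated in lung samples
    "GENE_B": [2.0, 2.1, 2.0, 1.9, 2.0, 2.1],  # roughly flat everywhere
    "GENE_C": [1.0, 1.1, 7.9, 8.2, 1.0, 0.8],  # elevated in breast samples
}
top_lung = rank_genes(expression, labels, "lung", top_k=1)

# Hypothetical predictions: one breast sample is misclassified as liver.
y_pred = ["lung", "lung", "breast", "liver", "liver", "liver"]
sens, prec = macro_sensitivity_precision(labels, y_pred)
print(top_lung, round(sens, 4), round(prec, 4))
```

In a 10-fold cross-validation setting, the same metric function would be applied to the held-out predictions of each fold (or to the pooled held-out predictions), and the ranking step would be run on the training portion only so that the selected genes are not informed by the held-out samples.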

Updated: 2020-08-05