Transcription factor expression as a predictor of colon cancer prognosis: a machine learning practice.
BMC Medical Genomics ( IF 2.7 ) Pub Date : 2020-09-21 , DOI: 10.1186/s12920-020-00775-0
Jiannan Liu 1 , Chuanpeng Dong 1, 2 , Guanglong Jiang 1, 2, 3 , Xiaoyu Lu 1, 2 , Yunlong Liu 2, 3 , Huanmei Wu 1, 4

Colon cancer is one of the leading causes of cancer death in the USA and worldwide. Beyond pathological indicators, molecular-level characteristics such as gene expression levels and mutations may provide valuable information for precision treatment. Transcription factors are critical regulators in all aspects of cell life, yet transcription factor-based biomarkers for colon cancer prognosis remain rare and are needed.
We implemented an innovative process that combines the Cox proportional hazards (Cox PH) model with the random forest algorithm to select transcription factor variables and evaluate their prognostic prediction power. We picked the five top-ranked transcription factors and built a prediction model using Cox PH regression. Using Kaplan-Meier analysis, we validated the predictive model on four independent, publicly available datasets (GSE39582, GSE17536, GSE37892, and GSE17537) from the GEO database, consisting of 925 colon cancer patients.
A predictive model for colon cancer prognosis based on five transcription factors was developed using TCGA colon cancer patient data. The five transcription factors in the model are HOXC9, ZNF556, HEYL, HOXC4, and HOXC6. The prediction power of the model was validated with four GEO datasets consisting of 1584 patient samples. Kaplan-Meier curves and log-rank tests on both the training and validation datasets showed a clear difference in overall survival time between the predicted low-risk and high-risk groups. Gene set enrichment analysis was performed to further investigate the differences between the low-risk and high-risk groups at the pathway level, and the biological meaning was interpreted. Overall, our results show that the prediction model has strong prediction power for colon cancer prognosis. Transcription factors can be used to construct colon cancer prognostic signatures with strong prediction power.
The variable selection process used in this study has the potential to be applied to prognostic signature discovery in other cancer types. Our five-TF-based predictive model would help in understanding the relationship between colon cancer patient survival and transcription factor activity. It will also provide more insight into the precision treatment of colon cancer patients from a genomic perspective.
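
The evaluation steps named in the abstract (a Cox PH risk score, a split into low-risk and high-risk groups, Kaplan-Meier estimation, and a log-rank test) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code. The coefficients in `COEFS`, the median cutoff, and the toy follow-up data are all hypothetical assumptions made for illustration, and the random forest variable selection step is not shown.

```python
# Minimal sketch of risk scoring and survival comparison.
# ASSUMPTIONS: coefficients, median cutoff, and toy data are hypothetical,
# not values or choices reported in the paper.

# Hypothetical Cox PH coefficients (log hazard ratios) for the five TFs.
COEFS = {"HOXC9": 0.30, "ZNF556": 0.20, "HEYL": 0.25, "HOXC4": 0.15, "HOXC6": 0.10}


def risk_score(expr):
    """Cox PH linear predictor: sum of coefficient * expression value."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the median, else 'low'."""
    s = sorted(scores)
    n = len(s)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    return ["high" if x > median else "low" for x in scores]


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event=1, censored=0)."""
    curve, surv = [], 1.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == "high")
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == "high")
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var if var > 0 else 0.0


if __name__ == "__main__":
    # Toy follow-up data (months, event flag); purely illustrative.
    times = [1, 2, 3, 4, 5, 6]
    events = [1, 1, 1, 1, 1, 1]
    scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # stand-ins for risk_score() output
    groups = split_by_median(scores)
    print(groups)
    print(kaplan_meier(times, events))
    print(logrank_chi2(times, events, groups))
```

With one degree of freedom, a chi-square statistic above roughly 3.84 corresponds to p < 0.05. In practice a survival analysis library would normally replace these hand-written estimators.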

Updated: 2020-09-21