Cancer prognosis prediction using somatic point mutation and copy number variation data: a comparison of gene-level and pathway-based models
BMC Bioinformatics (IF 2.9), Pub Date: 2020-10-20, DOI: 10.1186/s12859-020-03791-0
Xingyu Zheng 1 , Christopher I Amos 1, 2 , H Robert Frost 1

Genomic profiling of solid human tumors by projects such as The Cancer Genome Atlas (TCGA) has provided important information regarding the somatic alterations that drive cancer progression and patient survival. Although researchers have successfully leveraged TCGA data to build prognostic models, most efforts have focused on specific cancer types and a targeted set of gene-level predictors. Less is known about the prognostic ability of pathway-level variables in a pan-cancer setting. To address these limitations, we systematically evaluated and compared the prognostic ability of somatic point mutation (SPM) and copy number variation (CNV) data, and of gene-level and pathway-level models, across a diverse set of TCGA cancer types and predictive modeling approaches.

We evaluated gene-level and pathway-level penalized Cox proportional hazards models using SPM and CNV data for 29 different TCGA cohorts. We measured predictive accuracy as the concordance index for predicting survival outcomes.

Our comprehensive analysis suggests that the use of pathway-level predictors did not offer superior predictive power relative to gene-level models for all cancer types but had the advantages of robustness and parsimony. We identified a set of cohorts for which somatic alterations could not predict prognosis, and a unique cohort, LGG, for which SPM data were more predictive than CNV data and predictive accuracy was good for all model types. We found that pathway-level predictors provide superior interpretative value, and that gene-level models often suffer from serious collinearity while pathway-level models avoid it.

Our comprehensive analysis suggests that when using somatic alteration data for cancer prognosis prediction, pathway-level models are more interpretable, stable and parsimonious than gene-level models. Pathway-level models also avoid the issue of collinearity, which can be serious for gene-level somatic alterations. The prognostic power of somatic alterations is highly variable across cancer types, and we have identified a set of cohorts for which somatic alterations could not predict prognosis. In general, CNV data predict prognosis better than SPM data, with the exception of the LGG cohort.
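The abstract reports predictive accuracy as a concordance index. For readers unfamiliar with the metric, the following is a minimal sketch of one common definition (Harrell's C) in plain Python. It is not the authors' implementation: the function name `concordance_index`, the handling of ties, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Illustrative Harrell's C (hypothetical helper, not from the paper).

    times       -- observed follow-up times
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk paired with shorter survival
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data invented for illustration only (not TCGA values).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.4, 0.1]
c_index = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking of patients by risk, which is the scale on which the gene-level and pathway-level models are compared.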

Updated: 2020-10-20