Uncovering the prognostic gene signatures for the improvement of risk stratification in cancers by using deep learning algorithm coupled with wavelet transform.
BMC Bioinformatics ( IF 2.9 ) Pub Date : 2020-05-19 , DOI: 10.1186/s12859-020-03544-z
Yiru Zhao 1 , Yifan Zhou 2 , Yuan Liu 2 , Yinyi Hao 2 , Menglong Li 2 , Xuemei Pu 2 , Chuan Li 1 , Zhining Wen 2, 3

BACKGROUND The aim of gene expression-based clinical modelling in tumorigenesis is not only to predict clinical endpoints accurately, but also to reveal genomic characteristics for downstream analysis that helps explain the mechanisms of cancer. Most conventional machine learning methods involve a gene filtering step, in which tens of thousands of genes are first filtered by expression level using a statistical method with an arbitrary cutoff. Although gene filtering reduces the feature dimension and helps avoid overfitting, it risks discarding pathogenic genes that are important to the disease.

RESULTS In this study, we proposed a novel deep learning approach that combines a convolutional neural network with the stationary wavelet transform (SWT-CNN) to stratify cancer patients and predict their clinical outcomes from tumor genomic profiles without gene filtering. The proposed SWT-CNN outperformed state-of-the-art algorithms, including support vector machine (SVM) and logistic regression (LR), and achieved prediction performance comparable to that of random forest (RF). Furthermore, for all the cancer types, we first proposed a method that weights the genes with scores derived from the representative features in the hidden layer of the convolutional neural network, and then selected the prognostic genes for Cox proportional-hazards regression. The results showed that risk stratification can be effectively improved by using the identified prognostic genes as features, indicating that the representative features generated by SWT-CNN correlate genes well with prognostic risk in cancers and are helpful for selecting prognostic gene signatures.

CONCLUSIONS Our results indicated that the gene expression-based SWT-CNN model can be an excellent tool for stratifying prognostic risk in cancer patients. In addition, the representative features of SWT-CNN were validated as useful for evaluating the importance of genes in risk stratification and can be further used to identify prognostic gene signatures.
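
To illustrate the wavelet step named in the abstract, the following minimal Python sketch applies one level of an undecimated (stationary) Haar wavelet transform to an expression vector. The abstract does not state which wavelet, decomposition level, or boundary handling the authors used, so the Haar filter, the single level, the periodic boundary, the function names, and the example values are all assumptions made here for illustration. It is not the authors' implementation.

```python
import math

def haar_swt_level1(signal):
    """One level of an undecimated (stationary) Haar wavelet transform.

    Returns (approximation, detail). Unlike the decimated transform, both
    outputs keep the same length as the input, so every gene position is
    retained. Periodic boundary handling is an assumption of this sketch.
    """
    n = len(signal)
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[(i + 1) % n]) * scale for i in range(n)]
    detail = [(signal[i] - signal[(i + 1) % n]) * scale for i in range(n)]
    return approx, detail

def haar_iswt_level1(approx, detail):
    """Invert haar_swt_level1: x[i] = (a[i] + d[i]) / sqrt(2)."""
    scale = 1.0 / math.sqrt(2.0)
    return [(a + d) * scale for a, d in zip(approx, detail)]

# Hypothetical expression values for four genes (not data from the paper).
expression = [2.0, 4.0, 6.0, 8.0]
approx, detail = haar_swt_level1(expression)
reconstructed = haar_iswt_level1(approx, detail)
```

Because the transform is undecimated, the approximation and detail coefficients could be stacked as same-length input channels for a convolutional network. How the authors actually arranged the coefficients is not described in the abstract.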

Updated: 2020-05-19