Two-stage biologically interpretable neural-network models for liver cancer prognosis prediction using histopathology and transcriptomic data
medRxiv - Genetic and Genomic Medicine Pub Date : 2021-02-20 , DOI: 10.1101/2020.01.25.20016832
Zhucheng Zhan , Zheng Jing , Bing He , Noshad Hosseini , Maria Westerhoff , Eun-Young Choi , Lana X. Garmire

Purpose: Pathological images are easily accessible data with potential as prognostic biomarkers. Moreover, integrating heterogeneous data types across modalities, such as pathological images and gene expression data, is invaluable for predicting cancer patient survival. However, the analytical challenges are significant.
Experimental Design: We take hepatocellular carcinoma (HCC) pathological image features extracted by CellProfiler and use them as input to Cox-nnet, a neural network-based prognosis prediction method. We compare this model with the conventional Cox-PH model, CoxBoost, Random Survival Forests and DeepSurv, using the C-index and log-rank p-values on HCC testing samples. Further, to integrate pathological image and gene expression data from the same patients, we construct a two-stage Cox-nnet model and compare it with PAGE-Net, another complex neural network model.
Results: Pathological image-based prognosis prediction using Cox-nnet is significantly more accurate than the Cox-PH and Random Survival Forests models, and comparable with the CoxBoost and DeepSurv methods. Additionally, the two-stage Cox-nnet complex model combining histopathology image and transcriptomic RNA-Seq data achieves better prognosis prediction, with a median C-index of 0.75 and a log-rank p-value of 6e-7 in the testing datasets. These results are much more accurate than those of PAGE-Net, a CNN-based complex model (median C-index of 0.68 and log-rank p-value of 0.03). Imaging features provide predictive information beyond gene expression features, as the combined model is much more accurate than the model with gene expression alone (median C-index of 0.70). Pathological image features are modestly correlated with gene expression. Genes correlated with the top imaging features have known associations with HCC patient survival and with morphogenesis of liver tissue.
Conclusion: This work provides two-stage Cox-nnet, a new class of biologically relevant and relatively interpretable models, to integrate multi-modal data of multiple types for survival prediction.
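
The abstract reports model accuracy as a C-index (concordance index). As a rough illustration of what that number measures, the sketch below computes Harrell's C-index in plain Python: among patient pairs whose survival order is known, it counts the fraction in which the patient who died earlier was assigned the higher risk score. The function name, the variable names and the toy data are assumptions made for illustration; this is not the authors' evaluation code, and tied survival times are simply skipped here for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted hazard
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier patient was censored, so the order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported medians of 0.75 (two-stage Cox-nnet), 0.70 (gene expression alone) and 0.68 (PAGE-Net).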

Updated: 2021-02-21