Explainable cancer factors discovery: Shapley additive explanation for machine learning models demonstrates the best practices in the case of pancreatic cancer
Pancreatology ( IF 3.6 ) Pub Date : 2024-02-06 , DOI: 10.1016/j.pan.2024.02.002
Liuyan Su , Alphonse Houssou Hounye , Qi Pan , Kexin Miao , Jiaoju Wang , Muzhou Hou , Li Xiong

Pancreatic cancer is a digestive tract cancer with a high mortality rate. Despite the wide range of available treatments and improvements in surgery, chemotherapy, and radiation therapy, the five-year prognosis for individuals diagnosed with pancreatic cancer remains poor. Whether immunotherapy can be used to treat pancreatic cancer is still under investigation. The goals of our research were to understand the tumor microenvironment of pancreatic cancer, to find a useful biomarker for assessing patient prognosis, and to investigate its biological relevance. In this paper, machine learning methods such as random forest were combined with weighted gene co-expression networks to screen hub immune-related genes (hub-IRGs). A LASSO regression model was then applied for further screening, yielding eight hub-IRGs. Based on these hub-IRGs, we built a prognostic risk prediction model for PAAD that stratifies patients accurately and produces a prognostic risk score (IRG_Score) for each patient. In the raw data set and the validation data set, the five-year area under the curve (AUC) for this model was 0.9 and 0.7, respectively. Shapley additive explanation (SHAP) was used to rank, from a machine learning perspective, the factors influencing the prognostic risk prediction and to identify the most influential genes or clinical factors. The five most important factors were TRIM67, CORT, PSPN, SCAMP5, and RFXAP, all of which are genes. In summary, the eight hub-IRGs showed accurate risk prediction performance and biological significance, which was also validated in other cancers. The SHAP results helped to clarify the molecular mechanism of pancreatic cancer.
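To illustrate how a Shapley-based explanation attributes a risk score to individual factors, the following is a minimal sketch, not the authors' pipeline. It computes exact Shapley values by enumerating feature coalitions for a toy linear risk score. The coefficients, expression values, and baseline are all hypothetical; the gene names are borrowed from the abstract only as labels, and the function names `shapley_values` and `risk_score` are illustrative. In practice a SHAP library would approximate these values for a fitted model rather than enumerate every coalition.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all coalitions.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this
    is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [f for f in names if f != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_i = dict(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                total += weight * (predict(with_i) - predict(without_i))
        phi[i] = total
    return phi


# HYPOTHETICAL coefficients: not taken from the paper.
COEF = {"TRIM67": 0.8, "CORT": -0.5, "PSPN": 0.3}


def risk_score(sample):
    """Toy linear risk score standing in for a fitted prognostic model."""
    return sum(COEF[g] * sample[g] for g in COEF)


# HYPOTHETICAL expression values for one patient and a reference baseline.
patient = {"TRIM67": 2.0, "CORT": 1.0, "PSPN": 0.5}
baseline = {"TRIM67": 1.0, "CORT": 1.0, "PSPN": 1.0}

phi = shapley_values(risk_score, patient, baseline)

# Rank factors by the magnitude of their contribution, as a SHAP summary would.
ranking = sorted(phi, key=lambda g: abs(phi[g]), reverse=True)
print(ranking)
print(phi)
```

The attributions sum to the difference between the patient's score and the baseline score, which is the additivity property that gives SHAP its name. For a linear model each attribution reduces to the coefficient times the deviation from the baseline.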

Updated: 2024-02-06