Predicting Lung Cancers Using Epidemiological Data: A Generative-Discriminative Framework, IEEE/CAA Journal of Automatica Sinica



IEEE/CAA Journal of Automatica Sinica (IF 11.8), Pub Date: 2021-02-19, DOI: 10.1109/jas.2021.1003910
Jinpeng Li, Yaling Tao, Ting Cai


Predictive models for assessing the risk of developing lung cancer can help identify high-risk individuals so that further screening and early intervention can be recommended. To facilitate pre-hospital self-assessment, some studies have used predictive models trained on non-clinical data (e.g., smoking status and family history). The performance of these models is limited because they do not consider clinical data (e.g., blood test and medical imaging results). Deep learning has shown potential for processing complex data that combine clinical and non-clinical information. However, predicting lung cancer remains difficult because positive samples are severely scarce among follow-ups. To tackle this problem, this paper presents a generative-discriminative framework for improving the ability of deep learning models to generalize. In the proposed framework, two nonlinear generative models, one based on the generative adversarial network and the other on the variational autoencoder, are used to synthesize auxiliary positive samples for the training set. Then, several discriminative models, including a deep neural network (DNN), are used to assess lung cancer risk based on a comprehensive list of risk factors. The framework was evaluated on over 55 000 subjects questioned between January 2014 and December 2017, of whom 699 were clinically diagnosed with lung cancer between January 2014 and August 2019. According to the results, the best-performing predictive model built using the proposed framework was based on the DNN. It achieved an average sensitivity of 76.54% and an area under the curve of 69.24% in distinguishing lung cancer cases from normal cases on test sets.
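As a rough illustration of the workflow the abstract describes, the sketch below shows (1) topping up a scarce positive class with synthetic samples and (2) computing the two reported metrics, sensitivity and area under the ROC curve. It is a minimal sketch under stated assumptions, not the authors' implementation: the per-feature Gaussian sampler is a simple stand-in for the paper's GAN and VAE generators, and the function names `synthesize_positives`, `sensitivity`, and `auc` as well as the toy data are hypothetical.

```python
import random
import statistics


def synthesize_positives(positives, n_new, seed=0):
    """Draw synthetic positive samples from independent per-feature Gaussians.

    Assumption: this is only a stand-in for the GAN/VAE generators
    mentioned in the abstract, which model feature dependencies.
    """
    rng = random.Random(seed)
    columns = list(zip(*positives))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [[rng.gauss(m, s) for m, s in zip(means, stds)] for _ in range(n_new)]


def sensitivity(y_true, y_pred):
    """True-positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two real positives, augmented with four synthetic ones.
real_positives = [[1.0, 2.0], [3.0, 2.0]]
augmented = real_positives + synthesize_positives(real_positives, n_new=4)

# Toy classifier scores, thresholded at 0.5 to obtain hard predictions.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(len(augmented), sensitivity(y_true, y_pred), round(auc(y_true, scores), 4))
```

Synthetic samples would be added to the training set only; the test sets on which sensitivity and AUC are reported should contain real subjects alone.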


Updated: 2021-04-06
