Deep learning uncertainty quantification for clinical text classification
Journal of Biomedical Informatics (IF 4.5) Pub Date: 2023-12-13, DOI: 10.1016/j.jbi.2023.104576
Alina Peluso, Ioana Danciu, Hong-Jun Yoon, Jamaludin Mohd Yusof, Tanmoy Bhattacharya, Adam Spannaus, Noah Schaefferkoetter, Eric B. Durbin, Xiao-Cheng Wu, Antoinette Stroup, Jennifer Doherty, Stephen Schwartz, Charles Wiggins, Linda Coyle, Lynne Penberthy, Georgia D. Tourassi, Shang Gao

Introduction:

Machine learning algorithms are expected to work side-by-side with humans in decision-making pipelines. Thus, the ability of classifiers to make reliable decisions is of paramount importance. Deep neural networks (DNNs) represent the state-of-the-art models to address real-world classification. Although the strength of activation in DNNs is often correlated with the network’s confidence, in-depth analyses are needed to establish whether they are well calibrated.

Method:

In this paper, we demonstrate the use of DNN-based classification tools to benefit cancer registries by automating information extraction of disease at diagnosis and at surgery from electronic text pathology reports from the US National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries. In particular, we introduce multiple methods for selective classification to achieve a target level of accuracy on multiple classification tasks while minimizing the rejection amount—that is, the number of electronic pathology reports for which the model’s predictions are unreliable. We evaluate the proposed methods by comparing our approach with the current in-house deep learning-based abstaining classifier.
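The abstract does not say which selective classification methods are used, so the following is only a minimal sketch of one common approach, under stated assumptions: the confidence score is the maximum softmax probability, and a single rejection threshold is tuned on validation data so that the retained reports meet a target accuracy while as few reports as possible are rejected. The function names `pick_threshold` and `retain_mask` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def pick_threshold(confidence, correct, target_accuracy):
    """Pick the lowest confidence threshold whose retained set meets the target.

    confidence: one score per validation report (assumed: max softmax probability).
    correct: True where the model's prediction matches the label.
    Returns (threshold, rejection_rate); (None, 1.0) if no cut meets the target.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    n = confidence.size

    order = np.argsort(-confidence)          # most confident first
    sorted_conf = confidence[order]
    # Accuracy of the k+1 most confident predictions, for every k.
    cum_acc = np.cumsum(correct[order]) / np.arange(1, n + 1)

    # A cut after position k is only realisable with a threshold if the next
    # score is strictly lower; otherwise tied scores would also be retained.
    valid_cut = np.ones(n, dtype=bool)
    valid_cut[:-1] = sorted_conf[1:] < sorted_conf[:-1]

    candidates = np.nonzero((cum_acc >= target_accuracy) & valid_cut)[0]
    if candidates.size == 0:
        return None, 1.0                      # reject everything

    k = candidates[-1]                        # largest retained set
    return float(sorted_conf[k]), 1.0 - (k + 1) / n


def retain_mask(confidence, threshold):
    """True for reports the model keeps; False for reports sent to manual review."""
    confidence = np.asarray(confidence, dtype=float)
    if threshold is None:
        return np.zeros(confidence.shape, dtype=bool)
    return confidence >= threshold


# Toy validation data (hypothetical values, for illustration only).
val_conf = [0.99, 0.95, 0.90, 0.80, 0.60, 0.55]
val_correct = [True, True, True, False, True, False]
threshold, rejection_rate = pick_threshold(val_conf, val_correct, 0.97)
```

Because the threshold is applied to the outputs of an already trained model, a procedure of this kind needs no retraining, which is the property the abstract contrasts with the abstaining classifier.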

Results:

Overall, all of the proposed selective classification methods achieve the targeted level of accuracy or higher in a trade-off analysis aimed at minimizing the rejection rate. On in-distribution validation and holdout test data, all of the proposed methods reach the required target level of accuracy on all tasks with a lower rejection rate than the deep abstaining classifier (DAC). Interpreting the results for the out-of-distribution test data is more complex; nevertheless, in this case as well, the rejection rate of the best of the proposed methods that achieve 97% accuracy or higher is lower than the rejection rate of the DAC.

Conclusions:

We show that although both approaches can flag the samples that should be manually reviewed and labeled by human annotators, the newly proposed methods retain a larger fraction of samples and do so without retraining, thus offering a reduced computational cost compared with the in-house deep learning-based abstaining classifier.




Updated: 2023-12-15