Uncertainty quantification for multilabel text classification, WIREs Data Mining and Knowledge Discovery



WIREs Data Mining and Knowledge Discovery (IF 7.8), Pub Date: 2020-08-24, DOI: 10.1002/widm.1384
Wenshi Chen, Bowen Zhang, Mingyu Lu


Deep neural networks have recently achieved impressive performance on multilabel text classification. However, the uncertainty in multilabel text classification tasks and its application in the model are often overlooked. To better understand and evaluate this uncertainty, we propose a general framework called Uncertainty Quantification for Multilabel Text Classification. Starting from the prediction results produced by traditional neural networks, the framework further yields the aleatory uncertainty of each classification label and the epistemic uncertainty of the prediction result. We design experiments to characterize the properties of aleatory and epistemic uncertainty in terms of data characteristics and model features. The experimental results show that the framework is reasonable. Furthermore, we demonstrate how the framework allows us to define a model optimization criterion that identifies policies balancing expected training cost, model performance, and uncertainty sensitivity.
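The abstract does not say how the two kinds of uncertainty are computed, so the following is only a minimal sketch of one common decomposition, not the authors' method. It assumes the network can be sampled several times for the same input (for example with dropout left on at inference), giving one sigmoid probability per label per pass. The function name `decompose_uncertainty`, the averaging of per-label variance into a single epistemic score, and the toy input are all hypothetical.

```python
import random
from statistics import fmean, pvariance


def decompose_uncertainty(probs):
    """Split predictive variance into aleatory and epistemic parts.

    probs: list of T rows, each a list of L sigmoid probabilities
           (one row per stochastic forward pass). This input layout
           is an assumption made for illustration.
    """
    # Transpose so that each entry holds the T samples of one label.
    per_label = [list(col) for col in zip(*probs)]

    # Mean predicted probability for each label.
    mean_prob = [fmean(col) for col in per_label]

    # Aleatory part: expected Bernoulli variance p * (1 - p), per label.
    aleatory = [fmean([p * (1.0 - p) for p in col]) for col in per_label]

    # Epistemic part: spread of the sampled probabilities, per label.
    epistemic_per_label = [pvariance(col) for col in per_label]

    # Assumed aggregation: one epistemic score for the whole prediction.
    epistemic = fmean(epistemic_per_label)

    return mean_prob, aleatory, epistemic_per_label, epistemic


# Toy stand-in for 20 stochastic passes over 4 labels.
random.seed(0)
samples = [[random.random() for _ in range(4)] for _ in range(20)]
mean_prob, aleatory, epistemic_per_label, epistemic = decompose_uncertainty(samples)
```

Under this decomposition the two per-label terms sum to the total Bernoulli variance of the mean prediction, `mean_prob * (1 - mean_prob)`, which makes it easy to check that nothing is double counted. Reporting the aleatory term per label and the epistemic term per prediction mirrors the granularity described in the abstract, but the aggregation used here is a design choice of this sketch.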


Updated: 2020-10-17
