Efficient set-valued prediction in multi-class classification
Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2021-05-06, DOI: 10.1007/s10618-021-00751-x
Thomas Mortier, Marek Wydmuch, Krzysztof Dembczyński, Eyke Hüllermeier, Willem Waegeman

In cases of uncertainty, a multi-class classifier preferably returns a set of candidate classes instead of predicting a single class label with little guarantee. More precisely, the classifier should strive for an optimal balance between the correctness (the true class is among the candidates) and the precision (the candidates are not too many) of its prediction. We formalize this problem within a general decision-theoretic framework that unifies most of the existing work in this area. In this framework, uncertainty is quantified in terms of conditional class probabilities, and the quality of a predicted set is measured in terms of a utility function. We then address the problem of finding the Bayes-optimal prediction, i.e., the subset of class labels with the highest expected utility. For this problem, which is computationally challenging as there are exponentially (in the number of classes) many predictions to choose from, we propose efficient algorithms that can be applied to a broad family of utility functions. Our theoretical results are complemented by experimental studies, in which we analyze the proposed algorithms in terms of predictive accuracy and runtime efficiency.
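
To make the optimization problem concrete, the following is a minimal sketch and not the authors' implementation. It assumes a utility of the form u(y, S) = g(|S|) if the true class y is in the predicted set S and 0 otherwise, with g non-negative. Under that assumption the expected utility of S is g(|S|) times the probability mass of S, so for each set size k the best set consists of the k most probable classes, and only K candidate sets need to be compared instead of 2^K. The function name `bayes_optimal_set` and the particular choice of g in the usage note are illustrative assumptions.

```python
from typing import Callable, List, Tuple


def bayes_optimal_set(
    probs: List[float], g: Callable[[int], float]
) -> Tuple[List[int], float]:
    """Return (class indices, expected utility) maximizing g(|S|) * P(y in S).

    Assumes g(k) >= 0 for every set size k, so that for a fixed size
    the top-k classes by probability are the best choice.
    """
    # Sort class indices by decreasing conditional probability.
    order = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)

    best_k, best_value, mass = 0, float("-inf"), 0.0
    for k, c in enumerate(order, start=1):
        mass += probs[c]       # probability that the true class is in the top-k set
        value = g(k) * mass    # expected utility of the top-k set
        if value > best_value:
            best_k, best_value = k, value

    return sorted(order[:best_k]), best_value


def example_g(k: int) -> float:
    """Illustrative size discount (an assumption): rewards small sets."""
    return 1.6 / k - 0.6 / (k * k)
```

With `probs = [0.5, 0.4, 0.05, 0.05]` and `example_g`, the two leading classes are returned together, because a correct two-class set is worth more in expectation than a risky singleton. With `probs = [0.9, 0.05, 0.05]` the sketch returns only the most probable class.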




Updated: 2021-05-06