An instance-oriented performance measure for classification
Information Sciences Pub Date : 2021-09-01 , DOI: 10.1016/j.ins.2021.08.094
Shuang Yu 1, 2, 3 , Xiongfei Li 1, 2 , Yuncong Feng 1, 4 , Xiaoli Zhang 1, 2 , Shiping Chen 3

Performance evaluation is essential in data classification. Existing evaluation methods ignore the characteristics of individual instances, such as classification difficulty. In practice, it is necessary to measure classification performance from the perspective of instances. In this paper, an instance-oriented classification performance metric, named degree of credibility (Cr), is proposed based on the classification difficulty of each instance. Cr conforms to the intuition that the lower the probability of misclassifying relatively easy instances, the more credible the classifier. It focuses on the credibility of each instance's prediction, which opens up a new direction for classifier evaluation. Moreover, several important properties of Cr are identified, laying a solid theoretical foundation for classifier evaluation. The concept of an acceptable classifier is also proposed, to judge whether a trained model and its parameter set rank among the best achievable at the current technology level, instead of relying entirely on human experience. Experimental results for twelve classifiers on twelve datasets indicate the physical significance, good statistical consistency, and discriminatory ability of Cr, as well as the feasibility of using acceptable classifiers for model selection and training. Furthermore, the proposed approximate difficulty greatly improves the efficiency of computing instance difficulty.
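
The abstract does not give the definition of Cr, so the following is only a minimal sketch of the underlying intuition, not the paper's metric. It assumes, purely for illustration, that an instance's difficulty is the fraction of a pool of reference classifiers that misclassify it, and that errors on easy instances should cost more than errors on hard ones. The names `instance_difficulty` and `easy_weighted_score` and the weighting `1 - difficulty` are hypothetical.

```python
from typing import List, Sequence


def instance_difficulty(pool_predictions: Sequence[Sequence[int]],
                        y_true: Sequence[int]) -> List[float]:
    """Assumed difficulty proxy: share of pool classifiers that get
    each instance wrong (0.0 = easy for everyone, 1.0 = hard for everyone)."""
    n_models = len(pool_predictions)
    return [
        sum(preds[i] != label for preds in pool_predictions) / n_models
        for i, label in enumerate(y_true)
    ]


def easy_weighted_score(y_pred: Sequence[int],
                        y_true: Sequence[int],
                        difficulty: Sequence[float]) -> float:
    """Illustrative score (NOT the paper's Cr): weighted accuracy where an
    instance's weight is 1 - difficulty, so easy instances count more."""
    weights = [1.0 - d for d in difficulty]
    total = sum(weights)
    if total == 0:
        return 0.0  # every instance is maximally hard; nothing to credit
    credited = sum(w for w, p, y in zip(weights, y_pred, y_true) if p == y)
    return credited / total


# Toy data: four instances, three reference classifiers in the pool.
y_true = [0, 1, 1, 0]
pool = [
    [0, 1, 1, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
difficulty = instance_difficulty(pool, y_true)

# Two classifiers with the same plain accuracy (3 of 4 correct):
misses_easy = [1, 1, 1, 0]  # wrong on the instance nobody in the pool misses
misses_hard = [0, 1, 0, 0]  # wrong on an instance most of the pool misses

score_easy_miss = easy_weighted_score(misses_easy, y_true, difficulty)
score_hard_miss = easy_weighted_score(misses_hard, y_true, difficulty)
```

Both toy classifiers have identical plain accuracy, yet the one that misclassifies the easy instance receives the lower score. That difference is what an instance-oriented measure is meant to expose.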




Updated: 2021-09-10