BIDI: A classification algorithm with instance difficulty invariance
Expert Systems with Applications (IF 8.5), Pub Date: 2020-09-02, DOI: 10.1016/j.eswa.2020.113920
Shuang Yu, Xiongfei Li, Hancheng Wang, Xiaoli Zhang, Shiping Chen

In artificial intelligence, expert/intelligent systems can emulate the decision-making ability of human experts. A good classification algorithm can provide significant assistance to such systems in solving a variety of practical problems. In classification, "hard" instances may be outliers or noisy instances that are difficult to learn; placing too much emphasis on them may confuse the classifier and induce overfitting. The difficulty of instances is therefore crucial for improving the generalization and credibility of classification. Unfortunately, nearly all existing classifiers ignore this important information. In this paper, the classification difficulty of each instance is introduced from a statistical perspective as an inherent characteristic of the instance itself. A new classification algorithm named "boosting with instance difficulty invariance (BIDI)" is then proposed by incorporating the classification difficulty of instances. BIDI conforms to the human intuition that easy instances are misclassified with a lower probability than difficult ones, and it generalizes better. The key insight of BIDI can guide researchers in improving the generalization and credibility of classifiers in the expert systems used for decision support. Experimental results demonstrate the effectiveness of BIDI on real-world data sets, indicating that it has great potential for solving many classification tasks of expert systems, such as disease diagnosis and credit card fraud detection. Although the classification difficulty has strong statistical significance, computing it remains expensive. A fast method, shown to be rational and feasible, is therefore also proposed to approximate the classification difficulty of instances.
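The abstract does not give the formula for classification difficulty, so the following is only an illustrative sketch of the general idea and not the paper's definition or its fast approximation. It assumes that difficulty can be proxied by how often an instance is misclassified by simple classifiers trained on bootstrap samples that exclude it. The function names `centroid_predict` and `estimate_difficulty`, the nearest-centroid base learner, and the toy data are all hypothetical choices made for this example.

```python
import random


def centroid_predict(train, x):
    """Nearest-centroid prediction on 1-D (feature, label) pairs."""
    groups = {}
    for feature, label in train:
        groups.setdefault(label, []).append(feature)
    centroids = {label: sum(v) / len(v) for label, v in groups.items()}
    # Pick the class whose mean feature value is closest to x.
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def estimate_difficulty(data, n_rounds=200, seed=0):
    """Hypothetical difficulty proxy: out-of-bag error rate per instance.

    This is an assumption for illustration, not the statistic from the paper.
    """
    rng = random.Random(seed)
    n = len(data)
    errors = [0] * n
    trials = [0] * n
    for _ in range(n_rounds):
        # Draw a bootstrap sample (with replacement) of the same size.
        bag = [rng.randrange(n) for _ in range(n)]
        train = [data[i] for i in bag]
        # Score only the instances that were left out of this sample.
        for i in set(range(n)) - set(bag):
            trials[i] += 1
            x, y = data[i]
            if centroid_predict(train, x) != y:
                errors[i] += 1
    return [e / t if t else 0.0 for e, t in zip(errors, trials)]


# Toy data: two well-separated classes plus one mislabeled ("hard") instance.
data = [
    (0.0, 0), (1.0, 0), (2.0, 0), (3.0, 0),
    (10.0, 1), (11.0, 1), (12.0, 1), (13.0, 1),
    (1.5, 1),  # noisy label: sits inside class 0 but is labeled 1
]
difficulty = estimate_difficulty(data)
for (x, y), d in zip(data, difficulty):
    print(f"x={x:5.1f} label={y} difficulty={d:.2f}")
```

In this toy setting the mislabeled instance should receive a score close to 1 while the clean instances stay close to 0. A boosting procedure in the spirit of the abstract could use such scores to avoid concentrating weight on instances that are hard for reasons such as label noise, although how BIDI actually does this is not described in the text above.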




Updated: 2020-09-02