A robust approach to model-based classification based on trimming and constraints
Advances in Data Analysis and Classification (IF 1.6), Pub Date: 2019-08-14, DOI: 10.1007/s11634-019-00371-w
Andrea Cappozzo , Francesca Greselin , Thomas Brendan Murphy

In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations, namely outliers and data with incorrect labels, can strongly undermine the classifier performance, especially if the training size is small. The present work introduces a robust modification to the Model-Based Classification framework, employing impartial trimming and constraints on the ratio between the maximum and the minimum eigenvalue of the group scatter matrices. The proposed method effectively handles noise presence in both response and exploratory variables, providing reliable classification even when dealing with contaminated datasets. A robust information criterion is proposed for model selection. Experiments on real and simulated data, artificially adulterated, are provided to underline the benefits of the proposed method.
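
The two ingredients named in the abstract, impartial trimming and a bound on the ratio between the largest and smallest eigenvalue of a scatter matrix, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it fits a single Gaussian group rather than a full classifier, it enforces the eigenvalue-ratio bound by simply raising small eigenvalues to a floor, and the names `clip_eigenvalues`, `trimmed_fit`, `alpha`, and `max_ratio` are hypothetical, chosen only for illustration.

```python
import numpy as np


def clip_eigenvalues(cov, max_ratio):
    """Return a matrix whose largest/smallest eigenvalue ratio is <= max_ratio.

    Simplifying assumption: small eigenvalues are raised to a floor; the
    paper's own constrained estimator may differ.
    """
    vals, vecs = np.linalg.eigh(cov)
    floor = vals.max() / max_ratio
    vals = np.maximum(vals, floor)
    # Rebuild V diag(vals) V^T from the adjusted eigenvalues.
    return (vecs * vals) @ vecs.T


def trimmed_fit(X, alpha, max_ratio, n_iter=20):
    """Fit one Gaussian group while discarding the alpha fraction of
    observations that look least plausible under the current fit."""
    n = X.shape[0]
    n_keep = n - int(np.floor(n * alpha))
    keep = np.ones(n, dtype=bool)
    for _ in range(n_iter):
        mu = X[keep].mean(axis=0)
        cov = clip_eigenvalues(np.cov(X[keep], rowvar=False), max_ratio)
        diff = X - mu
        # Squared Mahalanobis distance: for a single Gaussian, a larger
        # distance means a lower density, so these points are trimmed first.
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        new_keep = np.zeros(n, dtype=bool)
        new_keep[np.argsort(d2)[:n_keep]] = True
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return mu, cov, keep
```

Calling `trimmed_fit(X, alpha=0.05, max_ratio=10.0)` on an `(n, p)` array returns the estimated mean, the constrained scatter matrix, and a boolean mask of the observations that were retained. The trimming level and the eigenvalue-ratio bound are left as user-chosen inputs in this sketch.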

Updated: 2019-08-14