A multiclass classification using one-versus-all approach with the differential partition sampling ensemble
Engineering Applications of Artificial Intelligence (IF 8) Pub Date: 2020-11-01, DOI: 10.1016/j.engappai.2020.104034
Xin Gao , Yang He , Mi Zhang , Xinping Diao , Xiao Jing , Bing Ren , Weijia Ji

The one-versus-all (OVA) approach is one of the mainstream decomposition methods, in which multiple binary classifiers are combined to solve a multiclass classification task. However, it suffers from serious class imbalance. This paper proposes a differential partition sampling ensemble method (DPSE) within the OVA framework. The numbers of majority and minority samples in each binary training set are used as the upper and lower limits of a sampling interval, respectively. Within this range, the construction of an arithmetic sequence is simulated to generate a set of different, equally spaced sampling numbers. All samples are divided into safe examples, borderline examples, rare examples, and outliers according to their neighborhood information; random undersampling for safe samples (s-Random undersampling) and SMOTE for borderline and rare examples (br-SMOTE) are then proposed based on the distribution characteristics of the classes. In each iteration, according to the differential sampling number, the two methods undersample the majority class or oversample the minority class in each binary training set to balance the numbers of positive and negative samples, preserving the characteristics of the class structure as much as possible. The balanced training sets are used to train a binary classification model composed of multiple sub-classifiers. Thorough experiments on 27 KEEL public multiclass datasets show that DPSE outperforms typical methods used in the OVA scheme, the one-versus-one scheme, or the direct approach in classification performance.
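Two building blocks of the abstract can be sketched in code: generating the equally spaced set of sampling numbers between the minority and majority counts, and categorizing samples as safe/borderline/rare/outlier from their nearest neighbors. This is a minimal illustrative sketch, not the authors' implementation; the neighborhood size (k = 5) and the category thresholds follow a common convention from the imbalanced-learning literature and are assumptions — the paper's exact rules may differ.

```python
import math

def sampling_numbers(n_min, n_maj, k):
    """Simulate an arithmetic sequence of k sampling numbers with equal
    intervals, bounded below by the minority count and above by the
    majority count of a binary training set."""
    if k == 1:
        return [n_min]
    step = (n_maj - n_min) / (k - 1)
    return [round(n_min + i * step) for i in range(k)]

def categorize(X, y, k=5):
    """Label each sample by its k nearest neighbours (Euclidean distance).
    Thresholds (>=4 same-class neighbours -> safe, 2-3 -> borderline,
    1 -> rare, 0 -> outlier) are an assumed convention."""
    labels = []
    for i, xi in enumerate(X):
        # Sort the other points by distance; exclude the point itself.
        order = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(xi, X[j]),
        )
        same = sum(1 for j in order[:k] if y[j] == y[i])
        if same >= 4:
            labels.append("safe")
        elif same >= 2:
            labels.append("borderline")
        elif same == 1:
            labels.append("rare")
        else:
            labels.append("outlier")
    return labels
```

In each DPSE iteration, one value from `sampling_numbers` would set the target size for both classes: safe majority samples are randomly undersampled down to it, while SMOTE interpolates new minority samples from the borderline and rare categories up to it.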




Updated: 2020-11-02