Multiclass classification using the one-versus-all approach with the differential partition sampling ensemble
Introduction
Classification is one of the most important problems in data mining. It is widely used in practical fields such as sentiment classification (Catal and Nangir, 2017), fraud detection (Nami and Shajari, 2018, Triepels et al., 2018), and fault diagnosis (Islam and Kim, 2018). A binary classification problem involves two classes, whereas a multiclass classification problem involves more than two, which makes it the more complicated of the two. Methods for dealing with multiclass classification fall mainly into two groups. The first expands a binary classifier into a multiclass classifier through some strategy; typical algorithms include the support vector machine (de Lima et al., 2018), the decision tree (Guan et al., 2017), oblique decision tree ensembles (Zhang and Suganthan, 2014, Katuwal et al., 2020), XGBoost (Chen and Guestrin, 2016), and deep neural networks (Hosaka, 2019). The second divides the multiclass problem into multiple binary problems (binarization) (Zhou et al., 2017, Liu et al., 2017). The former handles the multiclass task directly with one classifier. However, it is easier to build a classifier that distinguishes two classes than one that distinguishes many, and two-class decision boundaries may be simpler than multiclass ones (Krawczyk et al., 2018). Decomposing the original multiclass problem into several binary sub-problems is therefore much more tractable, so the second group of methods has attracted extensive attention in practical research (Zhang et al., 2016, Zhang et al., 2018, Li et al., 2020).
The two most popular binarization approaches are one-versus-one (OVO) and one-versus-all (OVA) (Galar et al., 2011). The OVO approach divides a multiclass problem with m classes into m(m-1)/2 binary sub-problems, and each classifier in the OVO scheme discriminates between a pair of classes. When a test pattern is classified by this scheme, a score matrix is obtained from all the binary classifiers. Since each classifier can only separate the two classes it was trained on, when the instance's true label belongs to neither class, the classifier gives an invalid discrimination (Galar et al., 2013, Galar et al., 2015, Zhou and Fujita, 2017). The OVA approach divides a multiclass problem with m classes into m binary sub-problems; each classifier treats one of the classes as the positive class and all the other classes as the negative class. Compared with the OVO scheme, when the dataset contains more classes, the OVA approach deploys fewer resources, i.e., fewer classifiers (Zhou and Fujita, 2017, Sen et al., 2016). With fewer classifiers, the outputs of all binary classifiers are simpler to aggregate. Additionally, there is no invalid-classifier phenomenon in this framework: each classifier only considers one class against all the others, which simplifies the problem. However, even if the original classes are balanced, the number of samples in all the other classes is larger than that in the target class, so the positive and negative sample numbers become imbalanced, which degrades the classification performance of each binary classifier (Sen et al., 2016, Zhang et al., 2016, Li et al., 2020). Solving the imbalance caused by OVA is therefore of great significance for improving classification performance.
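The OVA decomposition described above can be sketched in a few lines. The snippet below (function name `ova_split` is illustrative, not from the paper) builds one binary relabeling per class and shows how even perfectly balanced classes yield an (m-1):1 imbalance in every sub-problem:

```python
import numpy as np

def ova_split(X, y):
    """Decompose an m-class dataset into m one-vs-all binary sub-problems.
    For each class c, samples of c are relabelled 1 (positive) and all
    remaining samples 0 (negative)."""
    return {c: (X, (y == c).astype(int)) for c in np.unique(y)}

# Three perfectly balanced classes, four samples each.
X = np.arange(24).reshape(12, 2)
y = np.repeat([0, 1, 2], 4)
subs = ova_split(X, y)
pos = int(subs[0][1].sum())          # 4 positives per sub-problem
neg = int((subs[0][1] == 0).sum())   # 8 negatives: a built-in (m-1):1 imbalance
```

With m = 3 classes, each of the three sub-problems pits 4 positives against 8 negatives, which is exactly the imbalance that the proposed method targets.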
In recent years, binary class imbalance problems have received widespread attention (Lin et al., 2017, Douzas et al., 2018, Collell et al., 2018). In the OVA framework, each sub-problem can be regarded as a binary classification problem, so solving the imbalanced data classification problem in the OVA framework amounts to solving binary class imbalance problems. For example, Li et al. (2020) use the OVO decomposition strategy to divide multiclass classification problems into multiple binary classification problems and then use oversampling with spectral clustering to balance each binary imbalanced problem, effectively reducing the impact of imbalance. Zhang et al. (2016) explore the effectiveness of the binary imbalanced learning methods UnderBagging (Barandela et al., 2003), SMOTEBagging (Díez-Pastor et al., 2015), RUSBoost (Seiffert et al., 2010), SMOTEBoost (Chawla et al., 2003), SMOTE + AdaBoost (Liu et al., 2009), and EasyEnsemble (Liu et al., 2009) for solving multiclass imbalanced classification problems in the OVO framework. Their experiments show that combining a decomposition strategy with ensemble learning improves the mining of imbalanced multiclass problems. Compared with a single sampling method, ensemble learning combined with data preprocessing can not only balance the number of samples but also improve diversity (García et al., 2018). Therefore, our research focuses on using ensemble learning methods to solve the class imbalance problem in the OVA framework and thereby improve the classification performance of OVA.
In this paper, we propose a multiclass classification method for imbalanced data that combines the OVA decomposition strategy with ensemble learning. The method uses the differential partition sampling ensemble (DPSE) to construct a binary classification model for each sub-problem, which overcomes the data imbalance caused by the OVA decomposition strategy. DPSE combines ensemble learning with sampling methods to establish differentiated training sets that increase the diversity of the binary classification models. Before iteration, unlike DTE-SBD in Sun et al. (2018), DPSE calculates a differential set of gradually increasing sampling numbers by simulating the construction of an arithmetic sequence. To improve on single sampling methods, all samples are divided into safe examples, borderline examples, rare examples, and outliers according to their neighborhood information before data processing. Then random undersampling for safe examples (s-Random undersampling) and SMOTE for borderline and rare examples (br-SMOTE), both of which consider the distribution characteristics of the classes, are proposed to balance the numbers of minority and majority samples in each iteration.
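The neighborhood-based categorization into safe, borderline, rare, and outlier examples can be sketched as below. The thresholds used here (4-5 same-class neighbors among 5 → safe, 2-3 → borderline, 1 → rare, 0 → outlier) follow a widely used convention and are an assumption; the exact rule in DPSE may differ, and the function name is hypothetical:

```python
import numpy as np

def categorize_minority(X, y, minority_label, k=5):
    """Assign each minority sample a category from the class mix of its
    k nearest neighbours (assumed thresholds: 4-5 minority neighbours
    -> safe, 2-3 -> borderline, 1 -> rare, 0 -> outlier)."""
    categories = {}
    for i in np.where(y == minority_label)[0]:
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                   # exclude the sample itself
        nn = np.argsort(d)[:k]          # indices of the k nearest neighbours
        same = int((y[nn] == minority_label).sum())
        if same >= 4:
            categories[i] = "safe"
        elif same >= 2:
            categories[i] = "borderline"
        elif same == 1:
            categories[i] = "rare"
        else:
            categories[i] = "outlier"
    return categories

# A tight minority cluster far from the majority, plus one minority
# point stranded inside the majority region.
cluster = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [0.1, 0.1], [0.05, 0.05], [0.02, 0.08]])
majority = np.array([[10.0 + 0.3 * i, 10.0] for i in range(8)])
isolated = np.array([[10.15, 10.05]])
X = np.vstack([cluster, majority, isolated])
y = np.array([1] * 6 + [0] * 8 + [1])
cats = categorize_minority(X, y, minority_label=1)
# cluster points come out 'safe'; the stranded point comes out 'outlier'
```

Such a partition lets undersampling concentrate on safe regions and oversampling on the harder borderline and rare regions, which is the intuition behind s-Random undersampling and br-SMOTE.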
To verify the effectiveness of the proposed method, the average of the per-class accuracies (García et al., 2018) was used as the performance measure. 27 multiclass datasets from the KEEL dataset repository (Triguero et al., 2017) and three base classifiers, CART (Gordon et al., 1984), Random forest (Liaw and Wiener, 2002), and SVM (Vapnik, 1998), were selected for a thorough experimental study. In the OVA scheme, the differential partition sampling ensemble was compared with typical imbalanced learning methods. Moreover, the proposed method was compared with three typical methods for solving multiclass imbalanced classification problems. The significance of the differences between the proposed method and the other methods was verified by the Friedman test and the Holm–Bonferroni test (García et al., 2010).
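The average of per-class accuracies is simply the mean of the per-class recalls, so every class carries equal weight regardless of its size. A minimal sketch, contrasting it with plain accuracy on a majority-biased predictor:

```python
import numpy as np

def average_class_accuracy(y_true, y_pred):
    """Mean of the per-class accuracies (recall of each class): each
    class contributes equally, so minority-class errors are not hidden
    by majority-class successes."""
    per_class = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(per_class))

# A majority-biased predictor: perfect on class 0, 50% on class 1.
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 1, 0])
score = average_class_accuracy(y_true, y_pred)   # (1.0 + 0.5) / 2 = 0.75
plain = float((y_true == y_pred).mean())         # 5/6, hides the class-1 errors
```

This is why the measure is preferred over overall accuracy for imbalanced multiclass evaluation.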
The rest of this paper is organized as follows. We first describe the imbalanced learning problem and the solution in Section 2. Then, the approach proposed in this paper is introduced in Section 3. In Section 4, the experimental framework is presented in detail, as well as the results and discussion. Finally, the conclusions are given in Section 5.
Section snippets
Related work
To address the data imbalance problem in the OVA scheme, we first introduce and analyze the existing imbalanced learning methods in Section 2.1. Then the OVA decomposition strategy is described in detail in Section 2.2.
The proposed method: the differential partition sampling ensemble in the OVA
The total number of original training samples corresponding to each classifier is the same in the OVA framework. Considering the effectiveness of ensemble learning with preprocessing techniques in binary imbalanced learning, this approach is applied to each binary dataset to solve the data imbalance problem (Zhang et al., 2016). Undersampling and oversampling may have drawbacks when used in isolation, as a single approach is not suitable for all imbalanced datasets (Nanni et al., 2015).
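The "differential set" of sampling numbers mentioned earlier can be illustrated as an arithmetic progression of per-iteration sampling targets running from the minority count up to the majority count. This is a sketch of the idea only; the exact construction used by DPSE is not reproduced here, and the function name is hypothetical:

```python
def differential_set(n_min, n_maj, ensemble_size):
    """Illustrative differential set: 'ensemble_size' sampling targets
    growing arithmetically from the minority count to the majority
    count, so each base learner sees a differently balanced view of
    the data (a sketch, not the exact DPSE construction)."""
    if ensemble_size == 1:
        return [n_maj]
    step = (n_maj - n_min) / (ensemble_size - 1)
    return [round(n_min + t * step) for t in range(ensemble_size)]

# 20 minority vs. 100 majority samples, an ensemble of 5 base learners:
targets = differential_set(20, 100, 5)   # [20, 40, 60, 80, 100]
```

Giving each base learner a different sampling target is what differentiates the training sets across iterations and thereby promotes ensemble diversity.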
Experiment and evaluation
In this section, to verify the effectiveness of the proposed method, a comprehensive experimental analysis was performed on 27 KEEL public multiclass datasets using CART, Random forest, and SVM as the base classifiers. The datasets are described in Section 4.1, and the base classifiers with their parameter settings are described in Section 4.2. The measure used to evaluate the performance of the methods is presented in Section 4.3, and the results are reported and discussed in Section 4.4.
Conclusion
Binarization is a common way to divide a multiclass classification problem into several binary problems. In this paper, the differential partition sampling ensemble in the OVA scheme is proposed to reduce the impact of data imbalance on overall classification performance. The differential set of sampling numbers for each binary subset is determined in advance based on the idea of an increasing arithmetic progression. Then multiple balanced training sets are generated by undersampling the safe examples and oversampling the borderline and rare examples.
CRediT authorship contribution statement
Xin Gao: Conceptualization, Methodology, Writing - original draft, Supervision, Funding acquisition. Yang He: Conceptualization, Methodology, Software development, Validation, Writing - original draft. Mi Zhang: Methodology, Funding acquisition. Xinping Diao: Software development. Xiao Jing: Software development, Validation. Bing Ren: Experimental data preprocessing. Weijia Ji: Experimental data preprocessing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (55)
- et al. A sentiment classification model based on multiple classifiers. Appl. Soft Comput. (2017)
- et al. A simple plug-in bagging ensemble based on threshold-moving for classifying binary and multiclass imbalanced data. Neurocomputing (2018)
- et al. Improvements on least squares twin multi-class classification support vector machine. Neurocomputing (2018)
- et al. Random balance: Ensembles of variable priors classifiers for imbalanced data. Knowl.-Based Syst. (2015)
- et al. Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Inform. Sci. (2018)
- et al. Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches. Knowl.-Based Syst. (2013)
- et al. An overview of ensemble methods for binary classifiers in multi-class problems: Experimental study on one-vs-one and one-vs-all schemes. Pattern Recognit. (2011)
- et al. Dynamic classifier selection for One-vs-One strategy: Avoiding non-competent classifiers. Pattern Recognit. (2013)
- et al. DRCW-OVO: Distance-based relative competence weighting combination for one-vs-one strategy in multi-class problems. Pattern Recognit. (2015)
- et al. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Inform. Sci. (2010)