Robust twin bounded support vector machines for outliers and imbalanced data
Applied Intelligence ( IF 3.4 ) Pub Date : 2021-01-08 , DOI: 10.1007/s10489-020-01847-5
Parashjyoti Borah , Deepak Gupta

Truncated loss functions are robust to class noise and outliers. This paper proposes a robust twin bounded support vector machine that truncates the growth of its loss functions at a pre-specified point, thereby flattening the loss beyond that point. Moreover, to make the proposed method capable of handling datasets with different imbalance ratios, cost-sensitive learning is implemented by scaling the total error of each class according to its number of samples. However, the adopted loss functions are non-convex, so a global optimum is not always assured. To handle this issue, we apply the concave-convex procedure (CCCP), which decomposes each cost function into the sum of one convex and one concave part and guarantees convergence. The behaviour of the proposed method under varying imbalance ratios is analysed experimentally and depicted graphically. Classification performance of the proposed method in terms of AUC, F-Measure and G-Mean is compared with related methods, viz. hinge loss support vector machine (SVM), ramp loss SVM (RSVM), twin SVM (TWSVM), twin bounded SVM (TBSVM), pinball loss SVM (pin-SVM), entropy-based fuzzy SVM (EFSVM), non-parallel hyperplane Universum SVM (U-NHSVM), stochastic gradient twin support vector machine (SGTSVM), k-nearest neighbour (KNN)-based maximum margin and minimum volume hyper-sphere machine (KNN-M3VHM), and affinity and class probability-based fuzzy SVM (ACFSVM), on several real-world datasets with imbalance ratios ranging from low to high. Further, to establish the significance of the proposed method in pattern classification, pair-wise statistical comparison of the methods is performed based on their average ranks on AUC. Experimental results are convincing.
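The two core ideas of the abstract — a hinge loss truncated at a pre-specified point, written as a convex-plus-concave (CCCP-friendly) decomposition, and class weights scaled by per-class sample counts — can be sketched as follows. This is a minimal illustration of the general techniques, not the paper's exact formulation; the truncation point `s` and the weighting scheme are illustrative assumptions.

```python
import numpy as np

def hinge(z, c=1.0):
    # Convex hinge with margin c: H_c(z) = max(0, c - z),
    # where z = y * f(x) is the signed margin of a sample.
    return np.maximum(0.0, c - z)

def ramp(z, s=-1.0):
    # Truncated (ramp) loss: grows like the hinge until z = s,
    # then flattens at the value 1 - s, which bounds the influence
    # of outliers. CCCP-friendly decomposition as the sum of a
    # convex part H_1(z) and a concave part -H_s(z):
    #   R_s(z) = H_1(z) - H_s(z)
    return hinge(z, 1.0) - hinge(z, s)

def class_weights(y):
    # Cost-sensitive scaling: weight each class inversely to its
    # sample count so both classes contribute comparably to the
    # total error on imbalanced data (one common choice; the
    # paper's exact scaling may differ).
    y = np.asarray(y)
    n = len(y)
    w = {c: n / (2.0 * np.sum(y == c)) for c in np.unique(y)}
    return np.array([w[c] for c in y])
```

With `s = -1`, a correctly classified sample (`z = 2`) incurs zero loss, a sample on the decision boundary (`z = 0`) incurs the usual hinge loss of 1, and a far-side outlier (`z = -5`) is capped at `1 - s = 2` instead of growing without bound. In CCCP, one iteratively linearises the concave part `-H_s` and solves the resulting convex subproblem until convergence.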




Updated: 2021-01-10