Support vector machine and optimal parameter selection for high-dimensional imbalanced data
Communications in Statistics - Simulation and Computation (IF 0.8), Pub Date: 2020-09-08, DOI: 10.1080/03610918.2020.1813300
Yugo Nakayama

Abstract

In this article, we consider the asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings, with particular attention to high-dimensional imbalanced data. We investigate the behavior of the SVM with respect to the regularization parameter C in a kernel-function framework. We show that the SVM cannot handle imbalanced classification and is heavily biased in HDLSS settings. To overcome these difficulties, we propose a robust SVM (RSVM), which gives excellent performance in HDLSS settings. We also give a method for pre-selecting the parameters of the kernel function without cross-validation. Finally, we check the performance of RSVM and the optimality of the parameter choice in numerical simulations and real data analyses.




Updated: 2020-09-08