Optimal SVM parameter selection for non-separable and unbalanced datasets.
Structural and Multidisciplinary Optimization ( IF 3.9 ) Pub Date : 2014-10-01 , DOI: 10.1007/s00158-014-1105-z
Peng Jiang 1 , Samy Missoum 1 , Zhao Chen 2

This article presents a study of three validation metrics used for the selection of optimal parameters of a support vector machine (SVM) classifier in the case of non-separable and unbalanced datasets. This situation is often encountered when the data is obtained experimentally or clinically. The three metrics selected in this work are the area under the ROC curve (AUC), accuracy, and balanced accuracy. These validation metrics are tested using computational data only, which enables the creation of fully separable sets of data. This way, non-separable datasets, representative of a real-world problem, can be created by projection onto a lower dimensional sub-space. The knowledge of the separable dataset, unknown in real-world problems, provides a reference to compare the three validation metrics using a quantity referred to as the "weighted likelihood". As an application example, the study investigates a classification model for hip fracture prediction. The data is obtained from a parameterized finite element model of a femur. The performance of the various validation metrics is studied for several levels of separability, ratios of unbalance, and training set sizes.
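The abstract names the three validation metrics without giving formulas. As a rough illustration only, the sketch below computes each one from their standard textbook definitions on a small, made-up unbalanced validation set. It is not the authors' code: the labels, the decision values, and the function names are all hypothetical, and the "weighted likelihood" reference quantity is left out because the abstract does not define it.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for the labels +1 and -1."""
    recalls = []
    for cls in (+1, -1):
        preds_for_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
        hits = sum(1 for p in preds_for_cls if p == cls)
        recalls.append(hits / len(preds_for_cls))
    return sum(recalls) / len(recalls)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == +1]
    neg = [s for t, s in zip(y_true, scores) if t == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical unbalanced validation set: 2 positives, 8 negatives.
y_true = [+1, +1, -1, -1, -1, -1, -1, -1, -1, -1]
# Hypothetical SVM decision values; the sign gives the predicted label.
scores = [0.4, -0.3, -1.2, -0.8, -0.5, -0.9, -1.1, -0.2, -0.7, -0.6]
y_pred = [+1 if s > 0 else -1 for s in scores]

acc = accuracy(y_true, y_pred)
bacc = balanced_accuracy(y_true, y_pred)
area = auc(y_true, scores)
print(acc, bacc, area)
```

On this toy set the classifier misses one of the two positives, so accuracy stays high (0.9) while balanced accuracy drops to 0.75. That gap is the kind of disagreement between metrics that makes the choice of validation metric matter when classes are unbalanced.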

Updated: 2019-11-01