Cost-sensitive sample shifting in feature space
Pattern Analysis and Applications (IF 3.7), Pub Date: 2020-06-17, DOI: 10.1007/s10044-020-00890-9
Zhenchong Zhao, Xiaodan Wang, Chongming Wu, Lei Lei

Asymmetric misclassification costs are a common problem in many real-world applications. Cost-sensitive resampling, one of the most familiar preprocessing methods, has drawn great attention because it is easy to implement and broadly applicable. However, current methods mainly concentrate on changing the size of the training set, which alters the shape of the original distribution and can cause classifiers to overfit or become unstable. To address this, a new method named cost-sensitive kernel shifting is proposed. First, the training data are mapped from the input space to the feature space by a particular kernel function, and a distance metric is defined in that space. Second, the outliers are eliminated and the informative samples, including border and edge samples, are selected based on neighborhood and geometric information in the mapped space. Third, the positions of all the selected samples in the feature space are shifted. The moving step length is defined in proportion to both the ratio and the difference of the misclassification costs. Thanks to the kernel trick, only the kernel matrix needs to be modified in every step. Experiments on both synthetic and public datasets verify the effectiveness of the proposed method.
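
The abstract does not give formulas, so the following is only a minimal sketch of how shifting samples in feature space can be carried out on the kernel matrix alone. It assumes an RBF kernel, the standard kernel-induced distance d²(x, y) = k(x, x) - 2k(x, y) + k(y, y), and a shift that moves a sample a fraction `step` of the way toward another sample. The function names, the `moves` list, and the `step` parameter are hypothetical; in particular, the paper's cost-dependent step length and its sample selection rule are not reproduced here.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] - 2.0 * (X @ X.T) + sq[None, :]
    return np.exp(-gamma * np.maximum(d2, 0.0))

def feature_space_sq_dist(K):
    """Squared distances between mapped samples, computed from K only."""
    diag = np.diag(K)
    return np.maximum(diag[:, None] - 2.0 * K + diag[None, :], 0.0)

def shift_in_feature_space(K, moves, step):
    """Return the kernel matrix after shifting selected samples.

    moves: list of (i, j) pairs; the image of sample i is replaced by
           (1 - step) * phi(x_i) + step * phi(x_j). Each i should appear
           at most once. How i, j, and step are chosen is left to the
           caller (hypothetical interface, not the paper's rule).
    """
    n = K.shape[0]
    A = np.eye(n)  # row i holds the coefficients of the new image of sample i
    for i, j in moves:
        A[i, i] = 1.0 - step
        A[i, j] = step
    # New inner products <phi'_a, phi'_b> = sum_pq A[a, p] K[p, q] A[b, q]
    return A @ K @ A.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 2))
    K = rbf_kernel_matrix(X, gamma=0.5)
    K_shifted = shift_in_feature_space(K, moves=[(0, 1)], step=0.25)
    print(feature_space_sq_dist(K)[0, 1], feature_space_sq_dist(K_shifted)[0, 1])
```

Because each shifted image is a linear combination of the original images, the updated inner products follow from `A @ K @ A.T` without ever computing the feature map explicitly. The resulting matrix can be passed to any learner that accepts a precomputed kernel.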

Updated: 2020-06-17