Outlier Detection Using Structural Scores in a High-Dimensional Space.
IEEE Transactions on Cybernetics (IF 11.8), Pub Date: 2018-11-07, DOI: 10.1109/tcyb.2018.2876615
Xiaojie Li, Jiancheng Lv, Zhang Yi

Outlier detection has drawn significant interest from both academia and industry, for example in network intrusion detection. Most existing methods rely, implicitly or explicitly, on distances in Euclidean space. However, because of the curse of dimensionality, the Euclidean distance may fail to measure the similarity among high-dimensional data, leading to inferior performance in practice. This paper presents an approach to outlier detection based on meaningful structural scores. If two points have similar features, the difference between their structural scores is small, and vice versa. The scores are calculated by measuring the variance of angles weighted by data representation, which brings the global data structure into the measurement. As a result, the proposed method consistently ranks more similar points. Compared with existing methods, our structural scores can better reflect the characteristics of data in a high-dimensional space. Experiments on synthetic and several real-world datasets demonstrate the effectiveness and efficiency of the proposed method.
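The abstract does not give the formula for the structural scores, so the following is only a minimal sketch of the general idea of scoring points by the variance of angles, under stated assumptions. It computes a plain, unweighted variance of angle cosines and omits the data-representation weighting that the paper describes; it is not the authors' method. The function name `angle_variance_scores` and the toy data are hypothetical, introduced purely for illustration.

```python
import math
import random


def angle_variance_scores(points):
    """Hypothetical helper: for each point, return the variance of the cosines
    of the angles it forms with every pair of other points.
    A point far from the bulk of the data sees the others within a narrow
    cone, so its variance is low. Needs at least three distinct points."""
    scores = []
    for i, p in enumerate(points):
        # Unit direction vectors from p to every other point.
        dirs = []
        for j, q in enumerate(points):
            if j == i:
                continue
            diff = [b - a for a, b in zip(p, q)]
            norm = math.sqrt(sum(d * d for d in diff))
            if norm > 0.0:  # skip exact duplicates of p
                dirs.append([d / norm for d in diff])
        # Cosine of the angle at p for every pair of directions.
        cosines = [
            sum(x * y for x, y in zip(dirs[a], dirs[b]))
            for a in range(len(dirs))
            for b in range(a + 1, len(dirs))
        ]
        mean = sum(cosines) / len(cosines)
        scores.append(sum((c - mean) ** 2 for c in cosines) / len(cosines))
    return scores


# Toy data (an assumption for illustration): a 50-dimensional Gaussian
# cluster plus one point placed far away from it.
rng = random.Random(0)
dim = 50
cluster = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(30)]
outlier = [10.0] * dim
scores = angle_variance_scores(cluster + [outlier])

# Index of the point with the smallest angle variance.
most_outlying = min(range(len(scores)), key=scores.__getitem__)
```

In this toy setup the isolated point should receive the smallest score, because it sees the whole cluster within a narrow cone of directions. The sketch costs on the order of n^3 cosine evaluations, so it is suitable only for small examples.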

Updated: 2020-04-22