Outlier Detection Using Structural Scores in a High-Dimensional Space, IEEE Transactions on Cybernetics


IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 11-7-2018, DOI: 10.1109/tcyb.2018.2876615
Xiaojie Li, Jiancheng Lv, Zhang Yi

Outlier detection has drawn significant interest from both academia and industry, with applications such as network intrusion detection. Most existing methods rely, implicitly or explicitly, on distances in Euclidean space. However, because of the curse of dimensionality, the Euclidean distance may fail to measure the similarity among high-dimensional data, leading to inferior performance in practice. This paper presents an approach to outlier detection based on meaningful structural scores. If two points have similar features, the difference between their structural scores is small, and vice versa. The scores are calculated by measuring the variance of angles weighted by the data representation, which brings the global data structure into the measurement. As a result, the proposed method ranks similar points more consistently. Compared with existing methods, the structural scores better reflect the characteristics of data in a high-dimensional space. Experiments on synthetic and several real-world datasets demonstrate the effectiveness and efficiency of the proposed method.
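
To make the idea of an angle-variance score concrete, here is a minimal sketch in plain Python. It is not the authors' method: the abstract does not specify how the data representation enters the weighting, so the function accepts an optional matrix of nonnegative pair weights (a hypothetical stand-in named `weights`) and falls back to uniform weights. The function name, the toy data, and the convention that a lower variance suggests an outlier are all assumptions made for illustration.

```python
import math
import random


def weighted_angle_variance(points, weights=None):
    """Score each point by the weighted variance of the cosines of the angles
    it forms with all pairs of other points. Lower scores suggest outliers.

    `weights[i][j]` is a hypothetical nonnegative weight for the pair (i, j);
    uniform weights are used when it is None.
    """
    n = len(points)
    scores = []
    for i in range(n):
        # Unit direction vectors from point i to every other distinct point.
        directions = []
        for j in range(n):
            if j == i:
                continue
            diff = [b - a for a, b in zip(points[i], points[j])]
            norm = math.sqrt(sum(d * d for d in diff))
            if norm == 0.0:
                continue  # identical point: the angle is undefined
            w = 1.0 if weights is None else weights[i][j]
            directions.append(([d / norm for d in diff], w))

        # Cosine of the angle at point i for every pair of directions.
        cosines, pair_weights = [], []
        for a in range(len(directions)):
            for b in range(a + 1, len(directions)):
                (u, wu), (v, wv) = directions[a], directions[b]
                cosines.append(sum(x * y for x, y in zip(u, v)))
                pair_weights.append(wu * wv)

        total = sum(pair_weights)
        if total == 0.0:
            scores.append(0.0)
            continue
        mean = sum(w * c for w, c in zip(pair_weights, cosines)) / total
        var = sum(w * (c - mean) ** 2 for w, c in zip(pair_weights, cosines)) / total
        scores.append(var)
    return scores


# Toy data (an assumption for illustration): 30 Gaussian points in 20 dimensions
# plus one point placed far from the cluster.
rng = random.Random(0)
dim = 20
points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(30)]
points.append([10.0] * dim)
outlier_index = len(points) - 1

scores = weighted_angle_variance(points)
most_outlying = min(range(len(scores)), key=scores.__getitem__)
print(most_outlying == outlier_index)
```

A point far from the cluster sees all other points in nearly the same direction, so the cosines of its angles barely vary and its score is small. A point inside the cluster sees the others at widely varying angles. The sketch uses only angles, not raw Euclidean distances, which is the property the abstract highlights for high-dimensional data.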


Updated: 2024-08-22
