Mean-shift outlier detection and filtering
Pattern Recognition (IF 8), Pub Date: 2021-02-08, DOI: 10.1016/j.patcog.2021.107874
Jiawei Yang, Susanto Rahardja, Pasi Fränti

Traditional outlier detection methods create a model for the data and then label as outliers the objects that deviate significantly from this model. However, when the data has many outliers, the outliers also pollute the model. The model then becomes unreliable, rendering most outlier detectors ineffective. To solve this problem, we propose a mean-shift outlier detector. This detector employs a mean-shift technique to modify the data and cancel the bias caused by the outliers. The mean-shift technique replaces every object by the mean of its k-nearest neighbors, which essentially removes the effect of outliers before clustering without the need to know which objects are outliers. In addition, it detects outliers based on the distance shifted. Our experiments show that the proposed method works well regardless of the number of outliers in the data. The method outperforms all state-of-the-art methods tested, on both real-world numeric datasets and generated numeric and string datasets.
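
The two ideas in the abstract, replacing every object by the mean of its k-nearest neighbors and scoring objects by how far they were shifted, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `mean_shift` and `shift_scores`, the Euclidean distance, the default values of `k` and `iterations`, and the choice to count a point as its own nearest neighbor are all assumptions made for illustration.

```python
import numpy as np


def mean_shift(points, k):
    """One pass: replace every point by the mean of its k nearest neighbors."""
    # Pairwise Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # Indices of the k nearest neighbors of each point.
    # Assumption: a point counts as its own nearest neighbor (distance 0).
    nearest = np.argsort(dists, axis=1)[:, :k]
    # points[nearest] has shape (n, k, d); average over the k neighbors.
    return points[nearest].mean(axis=1)


def shift_scores(points, k=3, iterations=3):
    """Return (scores, shifted): how far each point moved, and where it ended up.

    The number of iterations is an assumed parameter of this sketch.
    """
    shifted = points.astype(float)
    for _ in range(iterations):
        shifted = mean_shift(shifted, k)
    # Outlier score: distance between the original and the shifted position.
    scores = np.linalg.norm(points - shifted, axis=1)
    return scores, shifted
```

With a small cluster plus one distant point, the distant point is pulled toward the cluster and therefore receives the largest score, while points inside the cluster barely move. The `shifted` array is the filtered data that a clustering algorithm could then be run on.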




Updated: 2021-02-21