KAGO: an approximate adaptive grid-based outlier detection approach using kernel density estimate
Pattern Analysis and Applications (IF 3.7), Pub Date: 2021-07-12, DOI: 10.1007/s10044-021-00998-6
Panthadeep Bhattacharjee 1 , Pinaki Mitra 1 , Ankur Garg 2

Outlier detection approaches are effective at extracting unforeseen knowledge in domains such as intrusion detection, e-commerce, and fraudulent transactions. A prominent method, the K-Nearest Neighbor (KNN)-based outlier detection (KNNOD) technique, relies on distance measures to extract anomalies from the dataset. However, KNNOD is ill-equipped to handle dynamic data environments efficiently because of its quadratic time complexity and its sensitivity to changes in the dataset. As a result, redundant computation caused by frequent updates may make outlier detection inefficient. To address these challenges, we propose KAGO, an approximate adaptive grid-based outlier detection technique that finds point density using a kernel density estimate instead of any distance measure. Upon a new point insertion, the proposed technique prunes the inlier grids and filters the candidate grids with local outliers. The grids containing potential outliers are then aggregated to converge incrementally on at most the top-N global outliers. Experimental evaluation showed that KAGO outperformed KNNOD by more than an order of \(\approx\)3.9 across large relevant datasets at about half the memory consumption.
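The abstract does not give the algorithm's details, so the following is only a minimal batch sketch of the general idea it describes: bucket points into grid cells, prune dense cells as inliers, and rank the remaining candidates by kernel density to obtain at most the top-N outliers. It is not the authors' implementation. The function names, the fixed `cell_size`, the Gaussian kernel, the `dense_threshold` pruning rule, and the absence of incremental updates are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def grid_key(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(math.floor(x / cell_size) for x in point)


def kernel_density(point, data, bandwidth):
    """Unnormalized Gaussian kernel density of `point` over `data`.

    The normalizing constant is omitted because only the ranking matters here.
    """
    total = 0.0
    for q in data:
        sq_dist = sum((a - b) ** 2 for a, b in zip(point, q))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / max(len(data), 1)


def top_n_outliers(points, cell_size=1.0, bandwidth=1.0, dense_threshold=5, n=3):
    """Return at most `n` points with the lowest density among sparse cells."""
    # Bucket every point into its grid cell.
    grid = defaultdict(list)
    for p in points:
        grid[grid_key(p, cell_size)].append(p)

    # Assumed pruning rule: cells holding at least `dense_threshold` points
    # are treated as inlier grids and skipped; the rest hold candidates.
    candidates = [
        p
        for cell in grid.values()
        if len(cell) < dense_threshold
        for p in cell
    ]

    # Rank candidates by ascending density and keep the n sparsest.
    candidates.sort(key=lambda p: kernel_density(p, points, bandwidth))
    return candidates[:n]


# A tight cluster of 20 points in one cell plus one isolated point.
cluster = [(0.1 + 0.04 * i, 0.1 + 0.04 * j) for i in range(5) for j in range(4)]
sample = cluster + [(10.0, 10.0)]
result = top_n_outliers(sample, n=1)
```

In this sketch the pruning step is what avoids scoring every point; a real incremental variant would update only the cell touched by a newly inserted point rather than rebuilding the grid.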




Updated: 2021-07-12