A Log-Based Anomaly Detection Method with Efficient Neighbor Searching and Automatic K Neighbor Selection
Scientific Programming (IF 1.672), Pub Date: 2020-06-02, DOI: 10.1155/2020/4365356
Bingming Wang, Shi Ying, Zhe Yang

Using the k-nearest neighbor (kNN) algorithm, a supervised learning method, to detect anomalies can yield more accurate results. However, kNN is inefficient at finding the k neighbors in large-scale log data. In addition, log data are imbalanced in quantity, so selecting a proper k for different data distributions is a challenge. In this paper, we propose a log-based anomaly detection method with efficient neighbor searching and automatic selection of the k neighbors. First, we propose a neighbor search method based on minhash and the MVP-tree. The minhash algorithm groups similar logs into the same bucket, and an MVP-tree model is built for the samples in each bucket. This reduces both the cost of distance calculation and the number of neighbor samples that must be compared, which improves the efficiency of finding neighbors. For selecting the k neighbors, we propose an automatic method based on the Silhouette Coefficient, which chooses a proper k to improve the accuracy of anomaly detection. We verify the method on six different types of log data to demonstrate its universality and feasibility.
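The bucketing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: whitespace tokenization, seeded MD5 as the hash family, the banding scheme, and the helper names (`minhash_signature`, `bucket_logs`, `k_nearest`). A linear scan with Jaccard distance stands in for the MVP-tree that the paper builds inside each bucket, and the Silhouette-based choice of k is not shown.

```python
import hashlib
from collections import defaultdict


def tokens(log_line):
    """Split a log line into a set of lowercase tokens (assumed tokenization)."""
    return set(log_line.lower().split())


def _hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def minhash_signature(token_set, num_hashes=16):
    """Minhash signature: the minimum hash value per seeded hash function."""
    if not token_set:
        raise ValueError("cannot build a signature for an empty token set")
    return tuple(min(_hash(t, seed) for t in token_set)
                 for seed in range(num_hashes))


def bucket_logs(logs, num_hashes=16, band_size=4):
    """Group log indices into buckets keyed by bands of the signature."""
    buckets = defaultdict(set)
    for idx, line in enumerate(logs):
        sig = minhash_signature(tokens(line), num_hashes)
        for start in range(0, num_hashes, band_size):
            buckets[(start, sig[start:start + band_size])].add(idx)
    return buckets


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two non-empty token sets."""
    return 1.0 - len(a & b) / len(a | b)


def candidates(logs, query_idx, buckets):
    """Indices that share at least one bucket with the query log."""
    found = set()
    for members in buckets.values():
        if query_idx in members:
            found |= members
    found.discard(query_idx)
    return found


def k_nearest(logs, query_idx, k, buckets=None):
    """Rank only the bucket candidates by Jaccard distance (linear scan)."""
    if buckets is None:
        buckets = bucket_logs(logs)
    query = tokens(logs[query_idx])
    ranked = sorted(
        candidates(logs, query_idx, buckets),
        key=lambda i: (jaccard_distance(query, tokens(logs[i])), i),
    )
    return ranked[:k]
```

Because distances are computed only against logs that share a bucket with the query, the number of comparisons depends on bucket size rather than on the size of the whole log set, which is the efficiency argument the abstract makes.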

Updated: 2020-06-02