OILog: An online incremental log keyword extraction approach based on MDP-LSTM neural network
Information Systems (IF 3.7), Pub Date: 2020-08-14, DOI: 10.1016/j.is.2020.101618
Xiaoyu Duan , Shi Ying , Hailong Cheng , Wanli Yuan , Xiang Yin

Log keyword extraction is an indispensable part of log anomaly detection. Keyword extraction faces two main challenges. First, logs are inherently unstructured, and different vendors usually define different log formats. Second, most traditional methods cannot update log keywords incrementally to match newly generated log data, so their extraction accuracy is low. To solve these problems, we introduce OILog, an online incremental keyword extraction method. The essential idea of the method is that log templates are usually the longest combination of high-frequency words. OILog builds its model with a deep Long Short-Term Memory (LSTM) network that captures, in real time, both high-frequency log keywords and new log keywords generated by the system, so that unstructured raw logs can be transformed into structured logs quickly. To improve the efficiency and accuracy of the model, we propose an improved Particle Swarm Optimization (PSO) algorithm, which changes the traditional PSO topology into a multilayer structure and applies a new particle velocity update formula to increase the attraction between particles. We summarize previous work and validate OILog using real log data collected from four systems. The results show that OILog is superior in terms of both accuracy and robustness.
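The intuition that log templates are built from high-frequency words can be illustrated with a small frequency-counting sketch. This is not the OILog model (which uses an LSTM tuned by the improved PSO); it is a simplified stand-in under stated assumptions: tokens are split on whitespace, frequency is counted per token position, and the function name `extract_templates`, the `min_count` threshold, and the `<*>` wildcard are all hypothetical choices made for illustration.

```python
from collections import Counter


def extract_templates(lines, min_count=2, wildcard="<*>"):
    """Keep tokens that recur at the same position; mask the rest.

    Illustrative heuristic only: a token seen at a given position in at
    least `min_count` lines is treated as a template keyword, and any
    rarer token is treated as a variable and replaced by `wildcard`.
    """
    tokenized = [line.split() for line in lines]
    # Count how often each (position, token) pair occurs across all lines.
    freq = Counter(
        (i, tok) for toks in tokenized for i, tok in enumerate(toks)
    )
    templates = []
    for toks in tokenized:
        masked = [
            tok if freq[(i, tok)] >= min_count else wildcard
            for i, tok in enumerate(toks)
        ]
        templates.append(" ".join(masked))
    return templates


# Made-up log lines, used only to exercise the sketch.
logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Connection from 10.0.0.3 closed",
    "Disk usage at 91 percent",
    "Disk usage at 87 percent",
]
templates = extract_templates(logs)
for template in sorted(set(templates)):
    print(template)
```

On these sample lines the variable fields (the IP address and the percentage) fall below the threshold and are masked, leaving one template per message type. A static count like this cannot adapt to new keywords without being recomputed, which is the limitation the abstract attributes to traditional methods.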




Updated: 2020-08-14