A Deep Random Forest Model on Spark for Network Intrusion Detection
Mobile Information Systems (IF 1.863), Pub Date: 2020-12-22, DOI: 10.1155/2020/6633252
Zhenpeng Liu 1,2, Nan Su 1, Yiwen Qin 3, Jiahuan Lu 1, Xiaofei Li 2

This paper addresses an important research problem in cyberspace security. As an active defense technology, intrusion detection plays an important role in network security. Traditional intrusion detection technologies suffer from low accuracy, low detection efficiency, and long processing times, and shallow machine learning structures cannot respond in time. To solve these problems, deep learning-based methods have been studied to improve intrusion detection. The advantage of deep learning is its strong feature-learning ability and its capacity to handle very complex data. We therefore propose a deep random forest-based network intrusion detection model. The first stage uses a sliding window to segment the original features into many small pieces and then trains a random forest on them to generate a concatenated class vector as a re-representation. This vector is used to train the multilevel cascade of parallel random forests in the second stage. Finally, the classification of the original data is determined by a voting strategy after the last layer of the cascade. The model is deployed in a Spark environment, where the cache replacement strategy for RDDs is optimized through efficiency sorting and partition integrity checks. The experimental results indicate that the proposed method can effectively detect anomalous network behavior, with high F1-measure scores and high accuracy. The results also show that it reduces the average execution time on clusters of different scales.
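
The data flow described in the abstract (sliding-window segmentation, concatenation of class vectors, and a final vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sliding_windows`, `concat_class_vectors`, and `vote` are hypothetical, the random forests themselves are omitted (the class vectors are passed in as plain lists), and the voting rule shown here, averaging the class vectors and taking the largest entry, is an assumption, since the abstract does not specify the strategy.

```python
from typing import List, Sequence


def sliding_windows(features: Sequence[float], window: int,
                    stride: int = 1) -> List[List[float]]:
    """Split one feature vector into overlapping pieces of length `window`."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > len(features):
        raise ValueError("window is longer than the feature vector")
    return [list(features[i:i + window])
            for i in range(0, len(features) - window + 1, stride)]


def concat_class_vectors(class_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-piece class vectors into one re-representation vector."""
    return [p for vec in class_vectors for p in vec]


def vote(class_vectors: Sequence[Sequence[float]]) -> int:
    """Assumed voting rule: average the class vectors, return the top class index."""
    n_classes = len(class_vectors[0])
    avg = [sum(vec[c] for vec in class_vectors) / len(class_vectors)
           for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)


if __name__ == "__main__":
    # Stage 1: segment a 5-feature record into pieces of 3 features each.
    pieces = sliding_windows([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
    print(pieces)

    # A forest trained on the pieces would emit one class vector per piece;
    # the values below are made up purely for illustration.
    per_piece = [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7]]
    print(concat_class_vectors(per_piece))

    # Final decision: vote over the class vectors of the last cascade layer.
    print(vote(per_piece))
```

In a fuller version, each piece would be scored by a trained random forest, and the concatenated vector would be passed through successive cascade layers before the vote.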

Updated: 2020-12-22