Performance Analysis of Intrusion Detection Systems Using a Feature Selection Method on the UNSW-NB15 Dataset
Journal of Big Data ( IF 8.6 ) Pub Date : 2020-11-25 , DOI: 10.1186/s40537-020-00379-6
Sydney M. Kasongo , Yanxia Sun

Computer network intrusion detection systems (IDSs) and intrusion prevention systems (IPSs) are critical components that contribute to the success of an organization. Over the past years, IDSs and IPSs using different approaches have been developed and implemented to ensure that computer networks within enterprises are secure, reliable and available. In this paper, we focus on IDSs that are built using machine learning (ML) techniques. IDSs based on ML methods are effective and accurate in detecting network attacks. However, the performance of these systems decreases for high-dimensional data spaces. Therefore, it is crucial to implement an appropriate feature extraction method that can prune the features that have little impact on the classification process. Moreover, many ML-based IDSs suffer from an increased false positive rate and a low detection accuracy when the models are trained on highly imbalanced datasets. In this paper, we present an analysis of the UNSW-NB15 intrusion detection dataset, which is used for training and testing our models. We then apply a filter-based feature reduction technique using the XGBoost algorithm, and implement the following ML approaches on the reduced feature space: Support Vector Machine (SVM), k-Nearest-Neighbour (kNN), Logistic Regression (LR), Artificial Neural Network (ANN) and Decision Tree (DT). In our experiments, we considered both the binary and multiclass classification configurations. The results demonstrate that the XGBoost-based feature selection method allows methods such as the DT to increase their test accuracy from 88.13% to 90.85% for the binary classification scheme.
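The filter step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the threshold or the exact selection rule, so the helper names (`select_features`, `project`), the cut-off of 0.05, and all scores and records below are assumptions made up for the example. In practice the scores could be read from a fitted model, for instance the `feature_importances_` attribute of `xgboost.XGBClassifier`.

```python
from typing import Dict, List, Sequence


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Keep features whose importance is at least `threshold`, highest first."""
    kept = [name for name, score in importances.items() if score >= threshold]
    return sorted(kept, key=lambda name: importances[name], reverse=True)


def project(rows: Sequence[Dict[str, float]], kept: Sequence[str]) -> List[List[float]]:
    """Reduce each record to the selected features, in a fixed column order."""
    return [[row[name] for name in kept] for row in rows]


# Hypothetical importance scores: illustrative values, NOT results from the paper.
scores = {
    "sttl": 0.41,
    "ct_state_ttl": 0.22,
    "dload": 0.09,
    "sbytes": 0.03,
    "dbytes": 0.01,
}

# Assumed cut-off; the abstract does not say which threshold was used.
selected = select_features(scores, threshold=0.05)

# Two made-up records standing in for rows of the dataset.
records = [
    {"sttl": 254.0, "ct_state_ttl": 2.0, "dload": 0.0, "sbytes": 496.0, "dbytes": 0.0},
    {"sttl": 31.0, "ct_state_ttl": 1.0, "dload": 8495.5, "sbytes": 364.0, "dbytes": 13186.0},
]

reduced = project(records, selected)
print(selected)
print(reduced)
```

The reduced matrix is what a downstream classifier such as a decision tree would then be trained on. Because the filter runs once, before training, the same reduced feature space can be shared by all five classifiers listed in the abstract.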




Updated: 2020-11-25