Improvement of the Classification Performance of an Intrusion Detection Model for Rare and Unknown Attack Traffic
Electronics (IF 2.9), Pub Date: 2021-09-15, DOI: 10.3390/electronics10182268
Sangsoo Han , Youngwon Kim , Soojin Lee

How rare and unknown data are handled in traffic classification has a decisive influence on classification performance. Rare data make it difficult to build validation datasets that guard against overfitting, and unknown data interfere with learning and degrade model performance. This paper presents a model generation method that accurately classifies rare data and new types of attacks without overfitting. First, we use oversampling methods to resolve the data imbalance caused by rare data. We then split the training data into a training dataset and a validation dataset, and build the model from these two separate datasets. The test dataset is used only to evaluate the classification models, which keeps it independent of learning. To detect new, unknown attacks, we also use a softmax function, which expresses numerically the probability that the model's prediction is accurate. Applied to the NSL_KDD dataset, the proposed method achieves an accuracy of 91.66%, an improvement of 6–16% over existing methods.
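
The three steps named in the abstract (oversampling, a training/validation split that leaves the test set untouched, and a softmax confidence used to flag unknown traffic) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the plain random-duplication oversampler, the 0.2 validation fraction, and the 0.9 confidence threshold are all assumptions made for the example, and the abstract does not say which oversampling method or threshold the paper uses.

```python
import math
import random
from collections import defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class is as
    large as the largest one (simple random oversampling; an assumption)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    pairs = []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        pairs.extend((x, y) for x in xs + extra)
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


def split_train_validation(samples, labels, val_fraction=0.2, seed=0):
    """Shuffle and split into (train, validation). The test set is never
    passed in here, so it stays independent of learning."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    n_val = int(len(idx) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]

    def pick(ids):
        return [samples[i] for i in ids], [labels[i] for i in ids]

    return pick(train_idx), pick(val_idx)


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_or_unknown(logits, class_names, threshold=0.9):
    """Return the top class, or "unknown" when the top softmax probability
    falls below the (hypothetical) confidence threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return class_names[best], probs[best]


if __name__ == "__main__":
    # Toy data: 8 "normal" records and 2 rare "u2r" records.
    xs, ys = random_oversample(list(range(10)), ["normal"] * 8 + ["u2r"] * 2)
    (train_x, train_y), (val_x, val_y) = split_train_validation(xs, ys, 0.25)
    print(len(train_x), len(val_x))
    print(classify_or_unknown([1.0, 1.0, 1.0], ["normal", "dos", "probe"]))
```

In this sketch a record whose highest class probability is low is reported as "unknown" rather than forced into a known class. A suitable threshold would have to be tuned on the validation data.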

Updated: 2021-09-15