Network anomaly detection based on selective ensemble algorithm
The Journal of Supercomputing (IF 2.5), Pub Date: 2020-07-07, DOI: 10.1007/s11227-020-03374-z
Hongle Du, Yan Zhang

To reduce the loss of majority-class information during resampling, this paper proposes a two-level selective ensemble learning algorithm for imbalanced datasets (TLSE-ID) that combines the distribution of the class samples with the characteristics of ensemble learning. First, the algorithm under-samples the majority class to construct multiple training subsets. On each subset, AdaBoost generates multiple base classifiers; some of these are selected according to the maximum correlation and minimum redundancy criteria and combined by weighted integration into a sub-classifier. This yields one sub-classifier per training subset, and a second round of selection under the same criteria keeps only some of the sub-classifiers. The weights of the selected sub-classifiers are then computed from the F-measure or G-mean, and the final ensemble classifier is obtained by weighted voting. Experimental results on UCI datasets show that the method improves classification performance to a certain extent, especially on imbalanced datasets. Finally, the algorithm is applied to network anomaly detection for the Internet of Things: simulation results on the KDDCUP99 dataset show that TLSE-ID achieves a low missed-detection rate and high precision.
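
The outer level of the procedure described above (under-sampling into several subsets, scoring each sub-classifier, and weighted voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the binary 0/1 labelling with 1 as the minority (anomaly) class, and the use of the raw G-mean or F-measure as the voting weight are all assumptions made for illustration, and both the AdaBoost training and the maximum correlation / minimum redundancy selection steps are omitted.

```python
import math
import random


def undersample_subsets(majority, minority, n_subsets, seed=0):
    """Build n_subsets balanced subsets (hypothetical helper).

    Each subset keeps every minority sample and adds an equally sized
    random draw, without replacement, from the majority samples.
    """
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    return [rng.sample(majority, k) + list(minority) for _ in range(n_subsets)]


def confusion(y_true, y_pred, positive=1):
    """Return (tp, fn, fp, tn) for a binary problem."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity."""
    tp, fn, fp, tn = confusion(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1):
    """F1 score: harmonic mean of precision and recall."""
    tp, fn, fp, _ = confusion(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_vote(predictions, weights):
    """Combine per-classifier label lists by weighted voting.

    predictions[j][i] is classifier j's label for sample i;
    weights[j] is that classifier's voting weight (assumed here to be
    its G-mean or F-measure on held-out data).
    """
    combined = []
    for i in range(len(predictions[0])):
        score = {}
        for preds, w in zip(predictions, weights):
            score[preds[i]] = score.get(preds[i], 0.0) + w
        combined.append(max(score, key=score.get))
    return combined
```

In use, each sub-classifier would be scored with `g_mean` or `f_measure` on validation data, and those scores passed as `weights` to `weighted_vote`. Accuracy is not used as the weight because, on an imbalanced dataset, a classifier that always predicts the majority class can still score highly on it, whereas its G-mean and F-measure are both zero.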

Updated: 2020-07-07