Optimization scheme for intrusion detection scheme GBDT in edge computing center
Computer Communications (IF 6), Pub Date: 2020-12-10, DOI: 10.1016/j.comcom.2020.12.007
Ju-fu Cui, Hui Xia, Rui Zhang, Ben-xu Hu, Xiang-guo Cheng

Combining edge computing technologies with machine learning helps put edge intelligence into practice, and the Industrial Internet of Things (IIoT) is one of its most typical applications. However, such a system is easily attacked while the edge computing center processes localized perception data. Intrusion detection technologies based on machine learning provide strong security for the edge computing center, and the most widely used of them is the gradient boosting decision tree (GBDT). This model still faces problems such as imbalanced data, high-dimensional data features, and inefficient parameter optimization. To solve these problems, this paper proposes an optimization scheme for GBDT that improves its detection precision and training efficiency. First, to address data imbalance in the data set, we propose a margin synthetic minority oversampling technique (MSMOTE), which expands the non-noise data of classes with few samples (small samples) to ensure a balanced data distribution. Second, to reduce feature dimensionality, we propose a recursive feature elimination-hierarchy cross validation algorithm (RFE-HCV). The new algorithm recursively eliminates redundant features according to feature weight, strengthening the relationship between features and targets. It also uses a hierarchy system to ensure that data categories (attack categories) appear in equal proportions in the training set and the testing set at the cross-validation stage. Next, to improve the efficiency of parameter optimization during model training, we develop a flexible grid search algorithm (FGS) that speeds up the search for optimal parameters. Finally, detailed experimental results show that the new scheme balances the data set, eliminates redundant features, and increases the efficiency of parameter optimization by three times. Moreover, the new scheme defends against intrusion more effectively.
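
The cross-validation step, in which each attack category keeps the same proportion in the training and testing sets, resembles what is commonly called stratified k-fold splitting. The sketch below illustrates only that general idea in plain Python; it is not the paper's RFE-HCV algorithm, and the function names, the toy labels, and the choice of four folds are assumptions made for illustration.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split sample indices into k folds with near-equal class proportions.

    Hypothetical helper for illustration; not the paper's RFE-HCV code.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    # Deal each class's indices round-robin so every fold receives
    # roughly the same share of every class.
    for label in sorted(by_class):
        for position, index in enumerate(by_class[label]):
            folds[position % k].append(index)
    return folds


def train_test_splits(labels, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k)
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((sorted(train), sorted(test)))
    return splits


# Toy labels (made-up data): 8 "normal" records and 4 "dos" records.
labels = ["normal"] * 8 + ["dos"] * 4
splits = train_test_splits(labels, k=4)
for train, test in splits:
    counts = {c: sum(labels[i] == c for i in test) for c in ("normal", "dos")}
    print(counts)
```

With these toy labels, every test fold holds two "normal" records and one "dos" record, so the 2:1 class ratio of the full set is preserved in each split. A practical version would also shuffle the indices within each class before dealing them out, which this sketch omits for brevity.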




Updated: 2020-12-10