Attack sample generation algorithm based on data association group by GAN in industrial control dataset, Computer Communications



Computer Communications (IF 6), Pub Date: 2021-04-16, DOI: 10.1016/j.comcom.2021.04.014
Wen Zhou, Xiang-min Kong, Kai-li Li, Xiao-ming Li, Lin-lin Ren, Yong Yan, Yun Sha, Xue-ying Cao, Xue-jun Liu

Security is increasingly important for industrial control networks, but intrusion detection research for these networks is severely constrained by the quantity and quality of the attack samples available in existing business datasets. To address the scarcity of attack data in industrial control datasets, this paper proposes an attack sample generation algorithm. First, the membership distance between dimensions is calculated from the weight and the membership degree distribution; the smaller the membership distance between dimensions, the stronger their data association. Dimensions separated by a small distance are then placed in the same group, which yields an association grouping of the original data. The more frequently an association group appears, the stronger the data association among its dimensions, so the association groups are divided by frequency into strong association groups and weak association groups. Attack samples are generated in the original data by applying a false data injection attack to all dimensions of one strong association group. Finally, a Generative Adversarial Network (GAN) expands these attack samples into a large attack-sample dataset for industrial control. Attack samples are generated from the BATADAL dataset and from the business dataset of an oil depot, and the algorithm expands the data 100-fold. Compared with the attack samples provided by the BATADAL dataset, the coincidence degree and fitting degree of the generated data are improved by 38.20%–42.94% and 98.22%–98.36%, respectively. The classification results of XGBoost and SVM are 100% and 98.01%, respectively, which are close to the classification results for the attack samples provided by the BATADAL dataset.
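The grouping and injection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the membership distance, so `1 - |Pearson correlation|` is used here purely as a stand-in, and the function names, the fixed windows used to count group frequency, the thresholds, and the constant-bias injection are all hypothetical choices. The GAN expansion step is omitted.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def distance(x, y):
    """Stand-in distance: 1 - |Pearson correlation| (an assumption, not the paper's metric)."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 1.0  # a constant dimension is treated as unrelated
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return 1.0 - abs(cov / (sx * sy))


def group_dimensions(columns, threshold):
    """Merge dimensions whose pairwise distance is below `threshold` (union-find)."""
    parent = list(range(len(columns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(columns)), 2):
        if distance(columns[i], columns[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(columns)):
        groups.setdefault(find(i), []).append(i)
    # Keep only groups with at least two dimensions, as hashable tuples.
    return [tuple(g) for g in groups.values() if len(g) > 1]


def strong_groups(columns, window, threshold, min_frequency):
    """Count how often each group recurs across fixed windows; frequent ones are 'strong'."""
    n = len(columns[0])
    counts = Counter()
    windows = 0
    for start in range(0, n - window + 1, window):
        chunk = [col[start:start + window] for col in columns]
        counts.update(group_dimensions(chunk, threshold))
        windows += 1
    return [g for g, c in counts.items() if c / windows >= min_frequency]


def inject_false_data(columns, group, start, stop, bias):
    """Return a copy with a constant bias added to every dimension of `group` in [start, stop)."""
    attacked = [list(col) for col in columns]
    for d in group:
        for t in range(start, stop):
            attacked[d][t] += bias
    return attacked
```

Here `columns` is a list of per-dimension time series of equal length. Attacking every dimension of one strong group at once, rather than a single sensor, mirrors the abstract's description of the injection step; a real false data injection attack would typically use a more carefully shaped perturbation than a constant bias.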


Updated: 2021-04-16
