Numerical sensitive data recognition based on hybrid gene expression programming for active distribution networks
Applied Soft Computing (IF 8.7), Pub Date: 2020-03-13, DOI: 10.1016/j.asoc.2020.106213
Song Deng, Xiangpeng Xie, Changan Yuan, Lechan Yang, Xindong Wu

Complex and flexible access modes and frequent data interaction pose significant security risks to data transmission in active distribution networks. Ensuring data security is critical to the safe and stable operation of these networks. Traditional methods, such as access control, data encryption, and text filtering based on intelligent algorithms, struggle to secure the transmission of dynamically growing, high-dimensional numerical data in active distribution networks. In this paper, we first propose a rough feature selection algorithm based on the average importance measurement (RFS-AIM) to reduce the complexity of data recognition. Then, we propose a sensitive data recognition function mining algorithm based on RFS-AIM and improved gene expression programming (SDR-IGEP), in which the population update operation is constructed from chromosome similarity based on the Jaccard coefficient. This operation avoids local convergence of gene expression programming by increasing individual diversity in the new population. Finally, we present a new incremental mining algorithm for a sensitive data recognition function based on global function fitting (ISDR-GFF), which uses a grain granulation model for incremental datasets. Experimental results on IEEE benchmark datasets and real datasets show that the proposed algorithms outperform state-of-the-art algorithms in terms of average running time, precision, recall, F1 index, accuracy, specificity, and speedup on all experimental datasets.
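The abstract does not say how chromosomes are compared or how the similarity feeds into the population update, so the following is only a minimal sketch of the general idea, not the authors' SDR-IGEP operator. It assumes that a chromosome is a sequence of symbols, that its set representation is the set of (position, symbol) pairs, and that the update greedily keeps the best candidates whose Jaccard similarity to every already kept chromosome stays at or below a threshold. The names `positional_genes`, `jaccard_similarity`, `diverse_update`, and `max_similarity` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def positional_genes(chromosome: Sequence[str]) -> Set[Tuple[int, str]]:
    """Assumed set view of a chromosome: one (position, symbol) pair per gene."""
    return set(enumerate(chromosome))


def jaccard_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two chromosomes."""
    set_a, set_b = positional_genes(a), positional_genes(b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty chromosomes are treated as identical
    return len(set_a & set_b) / len(union)


def diverse_update(
    ranked_candidates: Sequence[Sequence[str]],
    size: int,
    max_similarity: float = 0.6,  # hypothetical threshold, not from the paper
) -> List[Sequence[str]]:
    """Build a new population from candidates sorted best-first.

    A candidate is kept only if it is not too similar to any chromosome
    already kept; if that leaves the population short, the best skipped
    candidates fill the remaining slots.
    """
    kept: List[Sequence[str]] = []
    skipped: List[Sequence[str]] = []
    for candidate in ranked_candidates:
        if len(kept) == size:
            break
        if all(jaccard_similarity(candidate, k) <= max_similarity for k in kept):
            kept.append(candidate)
        else:
            skipped.append(candidate)
    kept.extend(skipped[: size - len(kept)])
    return kept


if __name__ == "__main__":
    # Chromosomes written as strings of single-character symbols.
    ranked = ["+ab", "+ab", "+ac", "*cd"]
    print(diverse_update(ranked, size=2, max_similarity=0.4))
```

With a lower `max_similarity`, near-duplicate chromosomes are pushed out in favor of structurally different ones, which is the diversity effect the abstract attributes to the similarity-based update.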



