Privacy-preserving deep learning algorithm for big personal data analysis
Journal of Industrial Information Integration (IF 10.4), Pub Date: 2019-07-06, DOI: 10.1016/j.jii.2019.07.002
Rasim M. Alguliyev, Ramiz M. Aliguliyev, Fargana J. Abdullayeva

A deep learning method is proposed for privacy-preserving analysis of big data. The method transforms the sensitive part of personal information into non-sensitive data. To implement this process, a two-stage architecture is proposed that combines a modified sparse denoising autoencoder with a CNN model. The modified sparse denoising autoencoder transforms the data, and the CNN classifies the transformed data. To achieve low loss in the data transformation, a sparsification parameter is added to the objective function of the autoencoder through the Kullback–Leibler divergence function. The efficiency of the model is evaluated with the MSE (mean squared error) loss function. To evaluate the accuracy of the transformation process, the features derived from the sparse denoising autoencoder are fed to the input of the deep CNN, which classifies the reconstructed data into the Black (0), White (1) and Gray (2) classes. Because the Black class data are transformed into Gray class data, in the classification stage the CNN classifies the Black class data as the Gray class with 0.99 accuracy. The proposed method is compared with a simple autoencoder, and experiments conducted on the Cleveland medical dataset extracted from the Heart Disease dataset, and on the Arrhythmia and Skoda datasets, show that the proposed method outperforms other conventional methods.
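
The objective described in the abstract, a reconstruction loss plus a Kullback–Leibler sparsity term, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the autoencoder is modified, so the code uses the common textbook form of a sparse denoising autoencoder loss. The Gaussian input noise, the sigmoid encoder, the linear decoder, and the names `rho` (target activation), `beta` (penalty weight) and `noise_std` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation, keeping hidden activations in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def kl_sparsity_penalty(hidden, rho=0.05, eps=1e-8):
    """Sum over hidden units of KL(rho || rho_hat).

    rho_hat is the mean activation of each hidden unit over the batch;
    rho is the (assumed) target sparsity level.
    """
    rho_hat = np.clip(hidden.mean(axis=0), eps, 1.0 - eps)
    return np.sum(
        rho * np.log(rho / rho_hat)
        + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat))
    )


def sparse_denoising_loss(x, w_enc, b_enc, w_dec, b_dec,
                          rho=0.05, beta=3.0, noise_std=0.1, rng=None):
    """Return (total_loss, mse) for one batch.

    The input is corrupted with Gaussian noise (an assumption), encoded,
    decoded, and compared against the clean input with MSE. The KL term
    is added with weight beta.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    x_noisy = x + noise_std * rng.standard_normal(x.shape)
    hidden = sigmoid(x_noisy @ w_enc + b_enc)   # encoder
    x_hat = hidden @ w_dec + b_dec              # linear decoder
    mse = np.mean((x_hat - x) ** 2)
    total = mse + beta * kl_sparsity_penalty(hidden, rho)
    return total, mse
```

The penalty is zero when every hidden unit's mean activation equals `rho` and grows as the activations move away from it, so the total loss is never smaller than the MSE alone. In the two-stage setting of the abstract, the hidden or reconstructed features from such a model would then be passed to a separate classifier.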




Updated: 2019-07-06