Expert Systems with Applications (IF 7.5), Pub Date: 2020-03-20, DOI: 10.1016/j.eswa.2020.113380. Benjamin Denham, Russel Pears, M. Asif Naeem
The sensitive nature of many data streams necessitates data mining techniques that are privacy-preserving. This paper proposes two data perturbation methods for privacy-preserving stream mining based on a combination of random projection, random translation, and two alternative forms of additive noise: noise generated independently for each record and noise that accumulates over the lifetime of a stream. Variations of the known input-output Maximum A Posteriori (MAP) attack that account for these combinations of perturbation techniques are proposed as a means of evaluating the privacy guarantees of the proposed perturbation methods. The capabilities of the proposed methods to resist privacy-breaching recovery attacks and retain accuracy in models trained on perturbed data are experimentally evaluated. The experiments showed that the cumulative noise injection scheme outperformed the other schemes by achieving a superior trade-off between privacy and classification accuracy.
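The abstract does not specify the perturbation itself, so the following is only a minimal sketch of how the three ingredients might be combined, under stated assumptions: each record x_t is mapped to R·x_t + T + n_t, where R is a Gaussian random projection matrix, T is a fixed random translation vector, and n_t is either fresh Gaussian noise per record (independent) or a running sum of such noise (cumulative). The function names, the 1/sqrt(k) scaling, the uniform translation range, and the Gaussian noise model are all illustrative assumptions, not the authors' published formulation.

```python
import random


def make_projection(k, d, rng):
    """Hypothetical helper: k x d Gaussian random projection matrix.

    The 1/sqrt(k) scaling is a common convention, assumed here.
    """
    scale = 1.0 / k ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def perturb_stream(records, d, k, sigma, cumulative, seed=0):
    """Yield a perturbed k-dimensional record for each d-dimensional input.

    Assumed model: y_t = R x_t + T + n_t, where n_t is either fresh
    Gaussian noise (independent) or the sum of all noise drawn so far
    (cumulative, i.e. a random walk over the stream's lifetime).
    """
    rng = random.Random(seed)
    R = make_projection(k, d, rng)                  # random projection
    T = [rng.uniform(-1.0, 1.0) for _ in range(k)]  # random translation
    drift = [0.0] * k                               # accumulated noise so far
    for x in records:
        projected = [sum(r * v for r, v in zip(row, x)) for row in R]
        fresh = [rng.gauss(0.0, sigma) for _ in range(k)]
        if cumulative:
            drift = [a + b for a, b in zip(drift, fresh)]
            noise = drift
        else:
            noise = fresh
        yield [p + t + n for p, t, n in zip(projected, T, noise)]


if __name__ == "__main__":
    stream = [[1.0, 2.0, 3.0]] * 4  # toy stream of identical records
    for mode in (False, True):
        label = "cumulative" if mode else "independent"
        for y in perturb_stream(stream, d=3, k=2, sigma=0.1, cumulative=mode):
            print(label, [round(v, 3) for v in y])
```

In this sketch the cumulative variant makes the offset applied to each record drift over time, so the mapping from original to perturbed values differs from record to record, whereas the independent variant perturbs every record around the same fixed projection and translation.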
Enhancing random projection with independent and cumulative additive noise for privacy-preserving data stream mining