TableGAN-MCA: Evaluating Membership Collisions of GAN-Synthesized Tabular Data Releasing
arXiv - CS - Cryptography and Security Pub Date : 2021-07-28 , DOI: arxiv-2107.13190
Aoting Hu, Renjie Xie, Zhigang Lu, Aiqun Hu, Minhui Xue

Publishing tables synthesized by Generative Adversarial Networks (GANs) lets people learn insights in a privacy-preserving way, without access to the private table. However, existing studies on Membership Inference (MI) attacks show promising results in disclosing the membership of the training datasets behind GAN-synthesized tables. Unlike those works, which focus on discovering the membership of a given data point, in this paper we propose a novel Membership Collision Attack against GANs (TableGAN-MCA), which allows an adversary given only synthetic entries randomly sampled from a black-box generator to recover part of the GAN training data. That is, a GAN-synthesized table that is immune to state-of-the-art MI attacks is still vulnerable to TableGAN-MCA. The success of TableGAN-MCA builds on the observation that GAN-synthesized tables potentially collide with the training data of the generator. Our experimental evaluations of TableGAN-MCA yield five main findings. First, TableGAN-MCA achieves a satisfying training data recovery rate on three commonly used real-world datasets against four generative models. Second, several factors, including the size of the GAN training data, the number of GAN training epochs, and the number of synthetic samples available to the adversary, are positively correlated with the success of TableGAN-MCA. Third, highly frequent data points are at high risk of being recovered by TableGAN-MCA. Fourth, some unique data points are exposed to unexpectedly high recovery risks under TableGAN-MCA, which may be attributable to the GAN's generalization. Fifth, as expected, differential privacy, which does not take the correlations between features into account, does not show a commendable mitigation effect against TableGAN-MCA. Finally, we propose two mitigation methods and show promising privacy and utility trade-offs when protecting against TableGAN-MCA.
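The notion of a "collision" between a synthetic table and the training table can be illustrated with a small sketch. This is not the authors' attack: the abstract does not spell out how TableGAN-MCA works, and the adversary it describes has no access to the training table. The snippet only shows, from the data owner's side, how one might count synthetic rows that exactly match training rows and what fraction of distinct training rows reappear. The function names `membership_collisions` and `recovery_rate`, the exact-match definition of a collision, and the toy rows are all assumptions made for illustration.

```python
def membership_collisions(train_rows, synthetic_rows):
    """Return the distinct rows that occur in both tables (exact match assumed)."""
    train_set = set(map(tuple, train_rows))
    synth_set = set(map(tuple, synthetic_rows))
    return train_set & synth_set


def recovery_rate(train_rows, synthetic_rows):
    """Fraction of distinct training rows that reappear among the synthetic rows."""
    train_set = set(map(tuple, train_rows))
    if not train_set:
        return 0.0
    return len(membership_collisions(train_rows, synthetic_rows)) / len(train_set)


# Toy, made-up categorical rows (not taken from any dataset in the paper).
train = [
    ("30-39", "Private", "HS-grad"),
    ("40-49", "Self-emp", "Bachelors"),
    ("30-39", "Private", "HS-grad"),   # duplicate: a "frequent" record
    ("20-29", "Gov", "Masters"),
]
synthetic = [
    ("30-39", "Private", "HS-grad"),   # collides with a training row
    ("50-59", "Private", "Doctorate"), # no collision
    ("20-29", "Gov", "Masters"),       # collides with a training row
]

collisions = membership_collisions(train, synthetic)
rate = recovery_rate(train, synthetic)
print(f"{len(collisions)} colliding rows, recovery rate = {rate:.2f}")
```

Exact matching is the simplest choice and only makes sense for discrete or discretized attributes; continuous columns would need a distance threshold, which this sketch does not cover.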

Updated: 2021-07-29