An Efficient DP-SGD Mechanism for Large Scale NLP Models
arXiv - CS - Cryptography and Security. Pub Date: 2021-07-14, DOI: arxiv-2107.14586
Christophe Dupuy, Radhika Arava, Rahul Gupta, Anna Rumshisky

Recent advances in deep learning have drastically improved performance on many Natural Language Understanding (NLU) tasks. However, the data used to train NLU models may contain private information such as addresses or phone numbers, particularly when drawn from human subjects. It is desirable that the underlying models do not expose private information contained in the training data. Differentially Private Stochastic Gradient Descent (DP-SGD) has been proposed as a mechanism for building privacy-preserving models. However, DP-SGD can be prohibitively slow to train. In this work, we propose a more efficient DP-SGD for training on GPU infrastructure and apply it to fine-tuning models based on LSTM and transformer architectures. We report training times alongside accuracy, theoretical privacy guarantees, and the success rate of membership inference attacks for our models, and observe that fine-tuning with the proposed variant of DP-SGD yields competitive models without significant degradation in training time, while improving privacy protection. We also observe that looser theoretical $(\epsilon, \delta)$ guarantees can translate into significant practical privacy gains.
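The DP-SGD mechanism the abstract builds on combines per-example gradient clipping with calibrated Gaussian noise before the parameter update (following the standard formulation of Abadi et al.). A minimal sketch of one such update step is below; the function name, shapes, and hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dp_sgd_step(per_example_grads, params, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update (sketch, not the paper's optimized variant).

    1. Clip each per-example gradient to L2 norm <= clip_norm.
    2. Average the clipped gradients over the batch.
    3. Add Gaussian noise with std = noise_multiplier * clip_norm / batch_size.
    4. Take a plain SGD step with the noised gradient.
    """
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down only if the gradient exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    batch_size = len(clipped)
    avg_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / batch_size,
                       size=avg_grad.shape)
    return params - lr * (avg_grad + noise)
```

The per-example clipping in step 1 is the main source of DP-SGD's slowdown on GPUs, since it prevents the usual trick of computing only the summed batch gradient; this is the bottleneck that efficiency-focused variants such as the one proposed here target.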

Updated: 2021-08-02