Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
arXiv - CS - Cryptography and Security. Pub Date: 2020-07-07, DOI: arxiv-2007.03813
Yingxue Zhou, Zhiwei Steven Wu, Arindam Banerjee

Differentially private SGD (DP-SGD) is one of the most popular methods for solving differentially private empirical risk minimization (ERM). Because it adds noise to each gradient update, the error rate of DP-SGD scales with the ambient dimension $p$, the number of parameters in the model. Such dependence can be problematic for over-parameterized models where $p \gg n$, with $n$ the number of training samples. Existing lower bounds on private ERM show that such dependence on $p$ is inevitable in the worst case. In this paper, we circumvent the dependence on the ambient dimension by leveraging a low-dimensional structure of the gradient space in deep networks: the stochastic gradients for deep nets usually stay in a low-dimensional subspace during training. We propose Projected DP-SGD, which performs noise reduction by projecting the noisy gradients onto a low-dimensional subspace given by the top gradient eigenspace on a small public dataset. We provide a general sample complexity analysis on the public dataset for the gradient subspace identification problem and demonstrate that, under certain low-dimensional assumptions, the public sample complexity grows only logarithmically in $p$. Finally, we provide a theoretical analysis and empirical evaluations to show that our method can substantially improve the accuracy of DP-SGD.
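
The projection step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`top_k_subspace`, `projected_dp_sgd_step`), the per-example clipping rule, and the noise scale `noise_multiplier * clip_norm` are assumptions borrowed from common DP-SGD practice, and the sketch makes no claim about the resulting privacy guarantee. It only shows the idea of estimating a top-$k$ gradient eigenspace from public gradients and projecting the noisy private gradient onto it.

```python
import numpy as np


def top_k_subspace(public_grads: np.ndarray, k: int) -> np.ndarray:
    """Return a (p, k) orthonormal basis for the top-k eigenspace of the
    second-moment matrix G^T G / m of the public gradients G (shape m x p)."""
    # The right singular vectors of G are the eigenvectors of G^T G,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(public_grads, full_matrices=False)
    return vt[:k].T


def projected_dp_sgd_step(w, private_grads, basis, lr, clip_norm,
                          noise_multiplier, rng):
    """One illustrative update: clip, add Gaussian noise, project, step.

    private_grads has shape (n, p): one gradient per private example.
    The clipping rule and noise scale are assumptions, not taken from the paper.
    """
    n = private_grads.shape[0]
    # Clip each per-example gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(private_grads, axis=1, keepdims=True)
    clipped = private_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Add Gaussian noise in the ambient p-dimensional space, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / n
    # Project the noisy gradient onto the k-dimensional public subspace.
    projected = basis @ (basis.T @ noisy_grad)
    return w - lr * projected
```

Because the projection keeps only $k$ of the $p$ noise coordinates, the noise that reaches the update lives in a $k$-dimensional subspace, which is the intuition behind the noise reduction the abstract refers to.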

Updated: 2020-07-09