On Sparse Linear Regression in the Local Differential Privacy Model
IEEE Transactions on Information Theory ( IF 2.5 ) Pub Date : 2021-02-01 , DOI: 10.1109/tit.2020.3040406
Di Wang , Jinhui Xu

In this paper, we study the sparse linear regression problem under the Local Differential Privacy (LDP) model. We first show that a polynomial dependency of the estimation error on the dimensionality $p$ of the space is unavoidable in both the non-interactive and the sequentially interactive local models if the privacy of the whole dataset needs to be preserved. Similar limitations also exist for other types of error measurements and in the relaxed local models. This indicates that differential privacy in high-dimensional space is unlikely to be achievable for this problem. With this limitation in mind, we then present two algorithmic results. The first is a sequentially interactive LDP algorithm for the low-dimensional sparse case, called Locally Differentially Private Iterative Hard Thresholding (LDP-IHT), which achieves a near-optimal upper bound. This algorithm is rather general and can be used to solve a number of other problems, such as (Local) DP-ERM with sparsity constraints and sparse regression with non-linear measurements. The second is for the restricted (high-dimensional) case where only the privacy of the responses (labels) needs to be preserved. For this case, we show that the optimal rate of the estimation error can be made logarithmically dependent on $p$ (i.e., $\log p$) in the local model, where the upper bound is obtained by a label-privacy version of LDP-IHT. Experiments on real-world and synthetic datasets confirm our theoretical analysis.
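The abstract names iterative hard thresholding with locally privatized updates but does not give the procedure. The following is a minimal, generic sketch of that idea, not the paper's LDP-IHT: each user is queried once, computes a squared-loss gradient on their own sample, clips it, adds Gaussian noise, and reports it, and the server takes an averaged gradient step followed by keeping the `s` largest-magnitude coordinates. The function names, the clipping rule, the Gaussian noise, and the parameters `step`, `clip`, and `sigma` are all assumptions made for illustration. In particular, `sigma` is not calibrated here to any privacy budget, so the code by itself certifies no privacy guarantee.

```python
import numpy as np


def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if s > 0:
        keep = np.argsort(np.abs(v))[-s:]
        out[keep] = v[keep]
    return out


def clip_l2(v, bound):
    """Rescale v so that its Euclidean norm is at most bound."""
    norm = np.linalg.norm(v)
    return v if norm <= bound else v * (bound / norm)


def noisy_iht(X, y, s, rounds, step, clip, sigma, rng):
    """Illustrative IHT with per-user noisy gradient reports.

    Users are split into `rounds` disjoint groups, so each user is
    queried exactly once (a sequentially interactive pattern).
    `sigma` is an assumed noise scale, not derived from (epsilon, delta).
    """
    n, p = X.shape
    theta = np.zeros(p)
    groups = np.array_split(rng.permutation(n), rounds)
    for idx in groups:
        if len(idx) == 0:
            continue  # more rounds than users: nothing to aggregate
        reports = []
        for i in idx:
            # Local step: squared-loss gradient on this user's own sample.
            grad = X[i] * (X[i] @ theta - y[i])
            # Clip to bound sensitivity, then perturb before reporting.
            reports.append(clip_l2(grad, clip) + rng.normal(0.0, sigma, size=p))
        # Server step: averaged gradient step, then projection onto s-sparse vectors.
        theta = hard_threshold(theta - step * np.mean(reports, axis=0), s)
    return theta
```

A call such as `noisy_iht(X, y, s=2, rounds=5, step=0.1, clip=10.0, sigma=0.1, rng=np.random.default_rng(0))` returns a vector with at most two non-zero entries. Because every coordinate of each report is perturbed, the noise in a single report has norm that grows with $p$, which gives some intuition for why dimension matters when the whole data record must be protected.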

Updated: 2021-02-01