An Uplink Communication-Efficient Approach to Featurewise Distributed Sparse Optimization With Differential Privacy
IEEE Transactions on Neural Networks and Learning Systems (IF 10.4) Pub Date: 2020-09-17, DOI: 10.1109/tnnls.2020.3020955
Jian Lou , Yiu-ming Cheung

In sparse empirical risk minimization (ERM) models, when sensitive personal data are used, e.g., genetic, healthcare, and financial data, it is crucial to preserve differential privacy (DP) during training. In many applications, the information (i.e., the features) of an individual is held by different organizations, which gives rise to the prevalent yet challenging setting of featurewise distributed multiparty model training. Such a setting also benefits scalability when the number of features exceeds the computation and storage capacity of a single node. However, existing private sparse optimization methods are limited to centralized and samplewise distributed datasets. In this article, we develop a differentially private algorithm for sparse ERM model training in the featurewise distributed setting. Our algorithm comes with guaranteed DP, nearly optimal utility, and reduced uplink communication complexity. To this end, we present a more general convergence analysis for block-coordinate Frank–Wolfe (BCFW) under arbitrary sampling (BCFW-AS for short), which significantly extends the known convergence results, previously applicable to only two specific sampling distributions. To further reduce the uplink communication cost, we design an active private feature sharing scheme, new in both the design and the analysis of BCFW, that guarantees convergence when Johnson–Lindenstrauss transformed features are communicated. Empirical studies corroborate the new convergence results as well as the nearly optimal utility.
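The block-coordinate Frank–Wolfe updates mentioned in the abstract can be sketched as follows. This is a minimal, non-private illustration assuming an L1-ball-constrained least-squares objective and uniform block sampling; the paper's algorithm additionally injects DP noise and covers arbitrary sampling distributions, and all names below are hypothetical.

```python
import numpy as np

def bcfw_l1_step(w, X, y, block, radius, t):
    """One block-coordinate Frank-Wolfe step over the L1 ball of the
    given radius, updating only the coordinates in `block`."""
    n = X.shape[0]
    grad = X.T @ (X @ w - y) / n          # gradient of 0.5*||Xw - y||^2 / n
    # Linear minimization oracle restricted to the block: a signed vertex
    # at the block coordinate with the largest gradient magnitude.
    j_local = int(np.argmax(np.abs(grad[block])))
    s_block = np.zeros(block.size)
    s_block[j_local] = -radius * np.sign(grad[block[j_local]])
    gamma = 2.0 / (t + 2.0)               # standard Frank-Wolfe step size
    w = w.copy()
    w[block] = (1.0 - gamma) * w[block] + gamma * s_block
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
w_true = np.zeros(8)
w_true[0] = 1.0                           # sparse ground truth
y = X @ w_true
w = np.zeros(8)
for t in range(200):
    block = rng.choice(8, size=2, replace=False)   # sampled feature block
    w = bcfw_l1_step(w, X, y, block, radius=1.0, t=t)
```

Because each update convex-combines the block coordinates with a vertex of the L1 ball, every coordinate stays within the radius throughout.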

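The uplink-compression idea behind communicating Johnson–Lindenstrauss transformed features can be illustrated with a plain Gaussian JL sketch. This toy (the paper's active private feature sharing scheme is more involved, and the dimensions and names here are assumptions) only shows that norms and inner products between feature columns survive projection to a much lower dimension.

```python
import numpy as np

# Toy Johnson-Lindenstrauss sketch of per-party feature columns.
rng = np.random.default_rng(1)
n, k = 2000, 400                               # samples, sketch dim (k << n)
P = rng.standard_normal((k, n)) / np.sqrt(k)   # shared JL projection matrix

x = rng.standard_normal(n)                     # feature column held by party A
z = rng.standard_normal(n)                     # feature column held by party B

# Each party uploads the k-dimensional sketch instead of the n-dimensional
# column; norms and inner products are preserved up to small distortion.
sx, sz = P @ x, P @ z
ip_error = abs(float(sx @ sz) - float(x @ z))  # inner-product distortion
norm_ratio = float(sx @ sx) / float(x @ x)     # squared-norm preservation
```

With k = 400 the uplink payload per column shrinks fivefold while the sketched inner products remain close to the originals, which is what makes gradient computations on the server side still meaningful.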
Updated: 2020-09-17