Training Production Language Models without Memorizing User Data
arXiv - CS - Cryptography and Security. Pub Date: 2020-09-21. DOI: arxiv-2009.10031. Swaroop Ramaswamy, Om Thakkar, Rajiv Mathews, Galen Andrew, H. Brendan McMahan, Françoise Beaufays
This paper presents the first consumer-scale next-word prediction (NWP) model
trained with Federated Learning (FL) while leveraging the Differentially
Private Federated Averaging (DP-FedAvg) technique. There has been prior work on
building practical FL infrastructure, including work demonstrating the
feasibility of training language models on mobile devices using such
infrastructure. It has also been shown (in simulations on a public corpus) that
it is possible to train NWP models with user-level differential privacy using
the DP-FedAvg algorithm. Nevertheless, training production-quality NWP models
with DP-FedAvg in a real-world production environment on a heterogeneous fleet
of mobile phones requires addressing numerous challenges. For instance, the
coordinating central server has to keep track of the devices available at the
start of each round and sample devices uniformly at random from them, while
ensuring secrecy of the sample, etc. Unlike all prior privacy-focused FL
work of which we are aware, for the first time we demonstrate the deployment of
a differentially private mechanism for the training of a production neural
network in FL, as well as the instrumentation of the production training
infrastructure to perform an end-to-end empirical measurement of unintended
memorization.
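
As a rough illustration of the mechanism the abstract refers to, the following is a minimal sketch of one DP-FedAvg-style aggregation step: the server samples devices uniformly at random from those available, clips each device's update to a fixed L2 norm, sums the clipped updates, adds Gaussian noise scaled to the clipping norm, and averages. The function names, parameters, and the use of plain Python lists are assumptions made for illustration. This is not the paper's production implementation, and it does not address secrecy of the sample or any of the infrastructure challenges mentioned above.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_fedavg_round(available_updates, round_size, clip_norm, noise_multiplier, rng):
    """One simplified DP-FedAvg-style aggregation step (illustrative only).

    available_updates: dict mapping a (hypothetical) device id to its
    model update, represented here as a list of floats.
    """
    # Sample a fixed-size set of devices uniformly at random, without replacement.
    sampled_ids = rng.sample(sorted(available_updates), round_size)

    dim = len(next(iter(available_updates.values())))
    total = [0.0] * dim
    for device_id in sampled_ids:
        # Clipping bounds how much any single device can move the sum.
        clipped = clip_update(available_updates[device_id], clip_norm)
        total = [t + c for t, c in zip(total, clipped)]

    # Gaussian noise with standard deviation proportional to the clipping norm.
    sigma = noise_multiplier * clip_norm
    noisy_total = [t + rng.gauss(0.0, sigma) for t in total]

    # Average over the fixed round size.
    return [v / round_size for v in noisy_total]


if __name__ == "__main__":
    rng = random.Random(0)
    updates = {"device_a": [3.0, 4.0], "device_b": [0.3, 0.4], "device_c": [-1.0, 2.0]}
    print(dp_fedavg_round(updates, round_size=2, clip_norm=1.0, noise_multiplier=0.5, rng=rng))
```

Because the noise scale depends only on the clipping norm and the noise multiplier, the contribution of any one device to the averaged update is bounded, which is what a user-level privacy guarantee relies on. The actual privacy accounting is outside the scope of this sketch.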
Updated: 2020-09-22