Training Production Language Models without Memorizing User Data
arXiv - CS - Cryptography and Security. Pub Date: 2020-09-21, DOI: arxiv-2009.10031
Swaroop Ramaswamy, Om Thakkar, Rajiv Mathews, Galen Andrew, H. Brendan McMahan, Françoise Beaufays

This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique. There has been prior work on building practical FL infrastructure, including work demonstrating the feasibility of training language models on mobile devices using such infrastructure. It has also been shown (in simulations on a public corpus) that it is possible to train NWP models with user-level differential privacy using the DP-FedAvg algorithm. Nevertheless, training production-quality NWP models with DP-FedAvg in a real-world production environment on a heterogeneous fleet of mobile phones requires addressing numerous challenges. For instance, the coordinating central server has to keep track of the devices available at the start of each round and sample devices uniformly at random from them, while ensuring *secrecy of the sample*, etc. Unlike all prior privacy-focused FL work of which we are aware, for the first time we demonstrate the deployment of a differentially private mechanism for the training of a production neural network in FL, as well as the instrumentation of the production training infrastructure to perform an end-to-end empirical measurement of unintended memorization.
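The core server-side step of DP-FedAvg described above (clip each client's update to bound its influence, average, then add calibrated Gaussian noise) can be sketched as follows. This is a minimal illustration assuming NumPy; the function names are hypothetical, not from the paper's codebase, and real deployments also need uniform client sampling with secrecy of the sample and a privacy accountant, which are omitted here.

```python
import numpy as np

def clip_update(update, clip_norm):
    """Rescale a client's model update so its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        return update * (clip_norm / norm)
    return update

def dp_fedavg_round(client_updates, clip_norm, noise_multiplier, rng):
    """One round of DP-FedAvg aggregation (sketch).

    Each per-client update is clipped to bound any single user's
    contribution, the clipped updates are averaged, and Gaussian noise
    scaled to the clip norm is added to the average before it is
    applied to the global model.
    """
    n = len(client_updates)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    avg = np.mean(clipped, axis=0)
    noise_std = noise_multiplier * clip_norm / n
    noise = rng.normal(0.0, noise_std, size=avg.shape)
    return avg + noise

# Hypothetical usage: 100 simulated client updates of dimension 4.
rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(100)]
global_delta = dp_fedavg_round(updates, clip_norm=1.0,
                               noise_multiplier=1.0, rng=rng)
```

Because the noise standard deviation is tied to the clip norm, user-level privacy guarantees hold regardless of how large any individual device's raw update is; the `noise_multiplier` trades off privacy against the utility of the averaged update.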

Updated: 2020-09-22