Collaborative deep learning across multiple data centers
Science China Information Sciences ( IF 8.8 ) Pub Date : 2020-07-14 , DOI: 10.1007/s11432-019-2705-2
Haibo Mi , Kele Xu , Dawei Feng , Huaimin Wang , Yiming Zhang , Zibin Zheng , Chuan Chen , Xu Lan

Valuable training data are often owned by independent organizations and stored in multiple data centers. Most deep learning approaches require the multi-datacenter data to be centralized for performance reasons. In practice, however, transferring all data from different organizations to a centralized data center is often infeasible owing to privacy regulations. Conducting geo-distributed deep learning across data centers without privacy leaks is therefore very challenging. Model averaging is a conventional choice for data-parallel training and can reduce the risk of privacy leaks, but previous studies have claimed it is ineffective because deep neural networks are typically non-convex. In this paper, we argue that model averaging can be effective in a decentralized environment when combined with two strategies, namely, a cyclical learning rate (CLR) and an increased number of epochs for local model training. With these two strategies, we show that model averaging in the decentralized setting achieves performance competitive with data-centralized training. In a practical environment with multiple data centers, we conduct extensive experiments using state-of-the-art deep network architectures on different types of data. The results demonstrate the effectiveness and robustness of the proposed method.
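To make the two strategies concrete, the following is a minimal sketch (not the authors' implementation) of one communication round of decentralized model averaging, paired with a triangular cyclical learning rate schedule in the style of Smith (2017). The function names (`cyclical_lr`, `average_models`, `communication_round`) and the parameter values are illustrative assumptions; models are represented as flat weight lists for simplicity.

```python
import math

def cyclical_lr(step, base_lr=0.001, max_lr=0.006, step_size=100):
    """Triangular cyclical learning rate: oscillates linearly between
    base_lr and max_lr with a full cycle every 2 * step_size steps.
    Values here are illustrative, not the paper's hyperparameters."""
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def average_models(local_weights):
    """Element-wise average of per-datacenter weight vectors. Only these
    parameters cross datacenter boundaries, never the raw training data."""
    n = len(local_weights)
    return [sum(ws) / n for ws in zip(*local_weights)]

def communication_round(global_weights, local_train_fns):
    """One round: each data center trains the shared model locally for
    several epochs (the paper's second strategy), then the resulting
    weights are averaged into the next global model."""
    locals_ = [train(list(global_weights)) for train in local_train_fns]
    return average_models(locals_)
```

In this sketch, each entry of `local_train_fns` stands in for a data center's local training loop, which would step its optimizer with `cyclical_lr(step)`; the averaging step is the only cross-datacenter communication.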




Updated: 2020-07-16