Tighter Theory for Local SGD on Identical and Heterogeneous Data
arXiv - CS - Numerical Analysis Pub Date : 2019-09-10 , DOI: arxiv-1909.04746
Ahmed Khaled, Konstantin Mishchenko, and Peter Richtárik

We provide a new analysis of local SGD, removing unnecessary assumptions and elaborating on the difference between two data regimes: identical and heterogeneous. In both cases, we improve the existing theory and provide values of the optimal stepsize and optimal number of local iterations. Our bounds are based on a new notion of variance that is specific to local SGD methods with different data. The tightness of our results is guaranteed by recovering known statements when we plug in $H=1$, where $H$ is the number of local steps. The empirical evidence further validates the severe impact of data heterogeneity on the performance of local SGD.
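
For readers unfamiliar with the method, the following is a minimal sketch of local SGD, not the authors' implementation. It assumes each worker holds its own least-squares data and samples one data point per step; the function name `local_sgd`, the array layout, and the choice of loss are illustrative assumptions that do not come from the paper. Each worker takes $H$ local stochastic gradient steps, and then all workers average their iterates.

```python
import numpy as np


def local_sgd(A, b, x0, stepsize, H, rounds, rng):
    """Illustrative local SGD on per-worker least-squares problems.

    A has shape (M, n, d) and b has shape (M, n): worker m holds the data
    (A[m], b[m]) and uses the per-sample loss 0.5 * (a_i @ x - b_i) ** 2.
    This loss and data layout are assumptions made for the sketch.
    """
    M, n, d = A.shape
    # One copy of the iterate per worker, all starting from x0.
    x = np.tile(np.asarray(x0, dtype=float), (M, 1))
    for _ in range(rounds):
        # H local steps on every worker, with no communication in between.
        for _ in range(H):
            for m in range(M):
                i = rng.integers(n)  # sample one local data point
                grad = (A[m, i] @ x[m] - b[m, i]) * A[m, i]
                x[m] -= stepsize * grad
        # Communication round: every worker receives the average iterate.
        x[:] = x.mean(axis=0)
    return x[0]
```

With `H=1` the workers average after every step, so the sketch reduces to parallel minibatch SGD with one sample per worker. Passing the same data to every worker mimics the identical-data regime, and passing different `A[m]`, `b[m]` mimics the heterogeneous one.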

Updated: 2020-03-03