
Improving the Convergence of Distributed Gradient Descent via Inexact Average Consensus


Abstract

It is observed that the inexact convergence of the well-known distributed gradient descent algorithm can be attributed to the inaccuracy of its consensus procedure. Motivated by this, we improve convergence by enforcing sufficiently accurate consensus through approximate consensus steps, whose accuracy is controlled by a predefined sequence of consensus error bounds. It is shown that exact convergence is achieved when the sequence decays to zero; furthermore, a linear convergence rate is obtained when the sequence decays to zero linearly. To implement the approximate consensus step with a given error bound, an inexact average consensus scheme that operates in a distributed manner is proposed. Owing to the flexibility in choosing both the consensus error bounds and the consensus scheme, the proposed framework offers the potential to balance communication and computation in distributed optimization.
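To make the framework concrete, the following is a minimal sketch of the scheme described above, under stated assumptions: quadratic local objectives, a doubly stochastic mixing matrix, and an idealized (centralized) stopping test inside the consensus routine. All function names and numerical values are illustrative and are not taken from the paper.

import numpy as np

def inexact_consensus(x_half, eps, W, max_rounds=1000):
    """Average x_half approximately: iterate x <- W x until every agent is
    within eps of the true average (an idealized stopping test used here
    only for illustration)."""
    x = x_half.copy()
    x_bar = x_half.mean(axis=0)
    for _ in range(max_rounds):
        if np.all(np.linalg.norm(x - x_bar, axis=1) <= eps):
            break
        x = W @ x
    return x

def dgd_inexact(grads, x0, W, alpha, eps0, rho0, iters):
    """Distributed gradient descent in which each iteration is a local
    gradient step followed by an eps^k-inexact average consensus step,
    with error bounds eps^k = eps0 * rho0**k decaying linearly."""
    x = x0.copy()
    n = len(grads)
    for k in range(iters):
        # local gradient step (the intermediate iterate x^{k+1/2})
        x_half = np.stack([x[i] - alpha * grads[i](x[i]) for i in range(n)])
        # approximate consensus step with error bound eps^k
        x = inexact_consensus(x_half, eps0 * rho0 ** k, W)
    return x

# Example: 4 agents minimizing (1/4) * sum_i 0.5 * ||x - a_i||^2,
# whose minimizer is the mean of the a_i (here 2.5).
a = np.array([[1.0], [2.0], [3.0], [4.0]])
grads = [lambda x, ai=ai: x - ai for ai in a]
W = np.full((4, 4), 0.25)  # doubly stochastic mixing matrix
x_final = dgd_inexact(grads, np.zeros((4, 1)), W,
                      alpha=0.5, eps0=1.0, rho0=0.9, iters=200)
print(x_final.ravel())  # every agent's iterate approaches 2.5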


References

  1. Bazerque, J.A., Giannakis, G.B.: Distributed spectrum sensing for cognitive radio networks by exploiting sparsity. IEEE Trans. Signal Process. 58(3), 1847–1862 (2010)


  2. Schizas, I.D., Giannakis, G.B., Roumeliotis, S.I., Ribeiro, A.: Consensus in ad hoc WSNs with noisy links—part II: distributed estimation and smoothing of random signals. IEEE Trans. Signal Process. 56(4), 1650–1666 (2008)


  3. Cao, Y., Yu, W., Ren, W., Chen, G.: An overview of recent progress in the study of distributed multi-agent coordination. IEEE Trans. Ind. Inform. 9(1), 427–438 (2013)


  4. Bullo, F., Cortes, J., Martinez, S.: Distributed Control of Robotic Networks: A Mathematical Approach to Motion Coordination Algorithms, vol. 27. Princeton University Press, Princeton (2009)


  5. Cevher, V., Becker, S., Schmidt, M.: Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics. IEEE Signal Process. Mag. 31(5), 32–43 (2014)


  6. Bekkerman, R., Bilenko, M., Langford, J.: Scaling Up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press, Cambridge (2011)


  7. Nedic, A., Ozdaglar, A.: Distributed subgradient methods for multi-agent optimization. IEEE Trans. Autom. Control 54(1), 48–61 (2009)


  8. Tsianos, K.I., Rabbat, M.G.: Distributed strongly convex optimization. In: 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 593–600. IEEE (2012)

  9. Nedic, A., Olshevsky, A., Shi, W.: Achieving geometric convergence for distributed optimization over time-varying graphs. SIAM J. Optim. 27(4), 2597–2633 (2017)


  10. Shi, W., Ling, Q., Wu, G., Yin, W.: EXTRA: an exact first-order algorithm for decentralized consensus optimization. SIAM J. Optim. 25(2), 944–966 (2015)


  11. Jakovetić, D.: A unification and generalization of exact distributed first-order methods. IEEE Trans. Signal Inf. Process. Netw. 5(1), 31–46 (2019)


  12. Shi, W., Ling, Q., Yuan, K., Wu, G., Yin, W.: On the linear convergence of the ADMM in decentralized consensus optimization. IEEE Trans. Signal Process. 62(7), 1750–1761 (2014)


  13. Mokhtari, A., Shi, W., Ling, Q., Ribeiro, A.: A decentralized second-order method with exact linear convergence rate for consensus optimization. IEEE Trans. Signal Inf. Process. Netw. 2(4), 507–522 (2016)


  14. Chen, A.I., Ozdaglar, A.: A fast distributed proximal-gradient method. In: 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 601–608. IEEE (2012)

  15. Jakovetić, D., Xavier, J., Moura, J.M.: Fast distributed gradient methods. IEEE Trans. Autom. Control 59(5), 1131–1146 (2014)


  16. Berahas, A.S., Bollapragada, R., Keskar, N.S., Wei, E.: Balancing communication and computation in distributed optimization. IEEE Trans. Autom. Control 64(8), 3141–3155 (2018)


  17. Manitara, N.E., Hadjicostis, C.N.: Distributed stopping for average consensus in undirected graphs via event-triggered strategies. Automatica 70, 121–127 (2016)


  18. Manitara, N.E., Hadjicostis, C.N.: Distributed stopping for average consensus in digraphs. IEEE Trans. Control Netw. Syst. 5(3), 957–967 (2018)


  19. Yadav, V., Salapaka, M.V.: Distributed protocol for determining when averaging consensus is reached. In: 45th Annual Allerton Conference, pp. 715–720 (2007)

  20. Sundaram, S., Hadjicostis, C.N.: Finite-time distributed consensus in graphs with time-invariant topologies. In: 2007 American Control Conference, pp. 711–716. IEEE (2007)

  21. Mou, S., Morse, A.S.: Finite-time distributed averaging. In: 2014 American Control Conference, pp. 5260–5263. IEEE (2014)

  22. Nesterov, Y.: Introductory Lectures on Convex Programming, Volume I: Basic Course. Lecture notes (1998)


  23. Nedic, A., Ozdaglar, A., Parrilo, P.A.: Constrained consensus and optimization in multi-agent networks. IEEE Trans. Autom. Control 55(4), 922–938 (2010)



Author information


Corresponding author

Correspondence to Bin Du.


Appendix: Proof of Theorem 3.1

We begin the proof by introducing some additional notation. Recall that \({\mathbf {x}}^{k+1/2}\) represents the intermediate result obtained by a pure gradient step, as shown in (5). We denote the exact averages of \({\mathbf {x}}^k\) and \({\mathbf {x}}^{k+1/2}\) by \(\bar{{\mathbf {x}}}^k\) and \(\bar{{\mathbf {x}}}^{k+1/2}\), respectively, i.e.,

$$\begin{aligned} \bar{{\mathbf {x}}}^k&= \frac{1}{n}\sum _{i=1}^n {\mathbf {x}}_i^k\quad \text {and}\quad \bar{{\mathbf {x}}}^{k+1/2} = \frac{1}{n}\sum _{i=1}^n {\mathbf {x}}_i^{k+1/2}. \end{aligned}$$
(25)

Moreover, we use \(\bar{{\mathbf {g}}}^k\) to denote the global gradient at \(\bar{{\mathbf {x}}}^k\) and \({\mathbf {g}}^k\) to denote the average of the local gradients at \({\mathbf {x}}^k\), i.e.,

$$\begin{aligned} \bar{{\mathbf {g}}}^k = \frac{1}{n}\sum _{i=1}^n \nabla {f}_i(\bar{{\mathbf {x}}}^k)\quad \text {and}\quad {\mathbf {g}}^k&= \frac{1}{n}\sum _{i=1}^n \nabla {f}_i({\mathbf {x}}_i^k). \end{aligned}$$
(26)

Next, we show in the following Lemma A.1 that, once \(\varepsilon \)-inexact average consensus is guaranteed by \({\mathcal {C}}^{\varepsilon ^k}\), the distance \(\Vert {{\mathbf {x}}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1}\Vert \) can be bounded in terms of the error \(\varepsilon ^k\).

Lemma A.1

Suppose that \({\mathbf {x}}^{k+1}\) is the output of \({\mathcal {C}}^{\varepsilon ^k}({\mathbf {x}}^{k+1/2})\), so that \(\Vert {\mathbf {x}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1/2}\Vert \le \varepsilon ^{k}\) holds for all \(i\in {\mathcal {N}}\). Then,

$$\begin{aligned} \Vert {{\mathbf {x}}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1}\Vert \le 2\varepsilon ^{k}, \quad \forall i \in {\mathcal {N}}. \end{aligned}$$
(27)

Proof

By the definition of \(\bar{{\mathbf {x}}}^{k+1}\), it holds that

$$\begin{aligned} \begin{aligned} \Vert \bar{{\mathbf {x}}}^{k+1} - \bar{{\mathbf {x}}}^{k+1/2}\Vert&= \Big \Vert \frac{1}{n}\sum _{i=1}^n\left( {\mathbf {x}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1/2}\right) \Big \Vert \\&\le \frac{1}{n}\sum _{i=1}^n\Vert {\mathbf {x}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1/2}\Vert \\&\le \varepsilon ^{k}. \end{aligned} \end{aligned}$$
(28)

Therefore, for all \(i \in {\mathcal {N}}\), one has

$$\begin{aligned} \begin{aligned} \Vert {{\mathbf {x}}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1}\Vert \le \Vert {{\mathbf {x}}}_i^{k+1} - \bar{{\mathbf {x}}}^{k+1/2} \Vert +\Vert \bar{{\mathbf {x}}}^{k+1/2}- \bar{{\mathbf {x}}}^{k+1}\Vert \le 2\varepsilon ^{k}, \end{aligned} \end{aligned}$$
(29)

\(\square \)
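As a quick numerical illustration of Lemma A.1 (not part of the original proof; the dimensions and data below are arbitrary assumptions): if every \({\mathbf {x}}_i^{k+1}\) lies within \(\varepsilon ^k\) of \(\bar{{\mathbf {x}}}^{k+1/2}\), then it lies within \(2\varepsilon ^k\) of the new average \(\bar{{\mathbf {x}}}^{k+1}\).

import numpy as np

rng = np.random.default_rng(0)
eps, n, p = 0.1, 5, 3
x_bar_half = rng.normal(size=p)  # stand-in for the average of x^{k+1/2}
# perturb each agent's output by a vector of norm at most eps
d = rng.normal(size=(n, p))
d *= eps * rng.uniform(0.0, 1.0, size=(n, 1)) / np.linalg.norm(d, axis=1, keepdims=True)
x_next = x_bar_half + d                     # consensus outputs x_i^{k+1}
x_bar_next = x_next.mean(axis=0)            # their exact average
dev = np.linalg.norm(x_next - x_bar_next, axis=1)
assert dev.max() <= 2 * eps                 # the bound (27) holds
print(dev.max(), "<=", 2 * eps)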

With the help of Lemma A.1, we next bound the distance between \(\bar{{\mathbf {x}}}^k\) and the optimizer \({\mathbf {x}}^\star \). Before doing that, we will need a few supporting lemmas.

Lemma A.2

Suppose that Assumption 1 holds. Then, one has

$$\begin{aligned} \Vert {\mathbf {g}}^k - \bar{{\mathbf {g}}}^k\Vert \le c_2 \varepsilon ^{k-1}, \end{aligned}$$
(30)

where \(c_2 = 2L\).

Proof

By the definitions of \({\mathbf {g}}^k\) and \(\bar{{\mathbf {g}}}^k\), as shown in (26), it holds that

$$\begin{aligned} \begin{aligned} \Vert {\mathbf {g}}^k - \bar{{\mathbf {g}}}^k\Vert&\le \frac{1}{n}\sum _{i=1}^n\Vert \nabla f_i({\mathbf {x}}_i^k) - \nabla f_i(\bar{{\mathbf {x}}}^k) \Vert \\&\overset{(A.2.\mathrm {a})}{\le } \frac{1}{n}\sum _{i=1}^n L_i\Vert {\mathbf {x}}_i^k - \bar{{\mathbf {x}}}^k\Vert \\&\overset{(A.2.\mathrm {b})}{\le } c_2 \varepsilon ^{k-1}, \end{aligned} \end{aligned}$$
(31)

where (A.2.a) is due to Assumption 1 and (A.2.b) follows from Lemma A.1. \(\square \)

Lemma A.3

Suppose that Assumptions 1 and 2 hold. Then, for all \({\mathbf {x}}, {\mathbf {y}} \in {\mathbb {R}}^p\),

$$\begin{aligned} \begin{aligned} (\nabla F({\mathbf {x}}) - \nabla F({\mathbf {y}}))^T ({\mathbf {x}}-{\mathbf {y}}) \ge \frac{1}{\mu +L}\Vert \nabla F({\mathbf {x}}) - \nabla F({\mathbf {y}})\Vert ^2 + \frac{\mu L}{\mu +L}\Vert {\mathbf {x}}-{\mathbf {y}}\Vert ^2. \end{aligned} \end{aligned}$$
(32)

Proof

The proof can be found in Theorem 2.1.11 of [22] and is thus omitted. \(\square \)

Now, we bound the distance \(\Vert \bar{{\mathbf {x}}}^{k+1}-{\mathbf {x}}^\star \Vert \) by the following lemma.

Lemma A.4

Suppose that Assumptions 1 and 2 hold, and let the constant step size satisfy \(\alpha \le 2/(\mu + L)\). Then, one has

$$\begin{aligned} \Vert \bar{{\mathbf {x}}}^{k+1} - {\mathbf {x}}^\star \Vert \le c_3\Vert \bar{{\mathbf {x}}}^{k} - {\mathbf {x}}^\star \Vert + c_4 \varepsilon ^{k-1}, \end{aligned}$$
(33)

where \(c_3 = \sqrt{1-2\alpha \mu L/(\mu +L)} < 1\) and \(c_4 = \alpha c_2 +1\).

Proof

Consider,

$$\begin{aligned} \begin{aligned} \Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star - \alpha \bar{{\mathbf {g}}}^k\Vert ^2&= \Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert ^2 + \alpha ^2 \Vert \bar{{\mathbf {g}}}^k\Vert ^2 -2\alpha (\bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star )^T\bar{{\mathbf {g}}}^k\\&\overset{(A.4.\mathrm {a})}{\le } \Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert ^2 + \alpha ^2 \Vert \bar{{\mathbf {g}}}^k\Vert ^2 - 2\alpha \Big (\frac{1}{\mu + L}\Vert \bar{{\mathbf {g}}}^k\Vert ^2 + \frac{\mu L}{\mu +L}\Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert ^2\Big )\\&= (1-\frac{2\alpha \mu L}{\mu +L})\Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert ^2 + \alpha (\alpha - \frac{2}{\mu + L})\Vert \bar{{\mathbf {g}}}^k\Vert ^2\\& \overset{(A.4.\mathrm {b})}{\le } c_3^2\Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert ^2, \end{aligned} \end{aligned}$$
(34)

where (A.4.a) follows from Lemma A.3 applied with \({\mathbf {y}} = {\mathbf {x}}^\star \), for which \(\nabla F({\mathbf {x}}^\star ) = {\mathbf {0}}\), and (A.4.b) is due to the assumption \(\alpha \le 2/(\mu + L)\). Therefore, it holds that

$$\begin{aligned} \begin{aligned} \Vert \bar{{\mathbf {x}}}^{k+1} - {\mathbf {x}}^\star \Vert&= \Vert \bar{{\mathbf {x}}}^{k+1} - \bar{{\mathbf {x}}}^{k+1/2} + \bar{{\mathbf {x}}}^{k+1/2} - {\mathbf {x}}^\star \Vert \\&\overset{(A.4.\mathrm {c})}{\le } \Vert \bar{{\mathbf {x}}}^{k+1} - \bar{{\mathbf {x}}}^{k+1/2}\Vert + \Vert \bar{{\mathbf {x}}}^{k} - {\mathbf {x}}^\star - \alpha {\mathbf {g}}^k\Vert \\&\overset{(A.4.\mathrm {d})}{\le } \varepsilon ^{k} + \Vert \bar{{\mathbf {x}}}^{k} - {\mathbf {x}}^\star - \alpha \bar{{\mathbf {g}}}^k\Vert + \alpha \Vert {{\mathbf {g}}}^k-\bar{{\mathbf {g}}}^k\Vert \\&\overset{(A.4.\mathrm {e})}{\le } \varepsilon ^{k} + c_3 \Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert + \alpha c_2 \varepsilon ^{k-1}\\&\overset{(A.4.\mathrm {f})}{\le } c_4 \varepsilon ^{k-1} + c_3 \Vert \bar{{\mathbf {x}}}^k - {\mathbf {x}}^\star \Vert , \end{aligned} \end{aligned}$$
(35)

where (A.4.c) is due to the fact that \(\bar{{\mathbf {x}}}^{k+1/2} = \bar{{\mathbf {x}}}^{k} - \alpha {\mathbf {g}}^k\), (A.4.d) follows from (28) and the triangle inequality, (A.4.e) is due to (34) and Lemma A.2, and (A.4.f) is due to the assumption \(\varepsilon ^{k-1} \ge \varepsilon ^{k}\) and the definition \(c_4 = \alpha c_2 + 1\). \(\square \)
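As a quick numerical sanity check of the constants appearing in Lemma A.4 (the values of \(\mu \), \(L\), and \(\alpha \) below are illustrative assumptions, not taken from the paper):

import math

mu, L = 1.0, 10.0                 # illustrative strong-convexity / smoothness constants
alpha = 2.0 / (mu + L)            # the largest step size permitted by Lemma A.4

c2 = 2.0 * L                                            # constant from Lemma A.2
c3 = math.sqrt(1.0 - 2.0 * alpha * mu * L / (mu + L))   # contraction factor in (33)
c4 = alpha * c2 + 1.0

assert 0.0 < c3 < 1.0             # the averaged iterate contracts toward x*
print(f"c3 = {c3:.4f}, c4 = {c4:.4f}")  # c3 ~ 0.818, c4 ~ 4.64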

Here, we introduce the last supporting lemma, which concerns the limit of the convolution of a geometrically decaying sequence with a vanishing sequence.

Lemma A.5

Let \(0<\lambda <1\) be a constant and let \(\{\beta ^k\}_{k\in {\mathbb {N}}_+}\) be a positive scalar sequence with \(\lim _{k \rightarrow \infty } \beta ^k = 0\). Then,

$$\begin{aligned} \lim _{k \rightarrow \infty } \sum _{t=0}^{k}(\lambda )^{k-t}\beta ^t = 0. \end{aligned}$$
(36)

Proof

The proof can be found in Lemma 7 of [23] and is thus omitted. \(\square \)
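The following snippet illustrates Lemma A.5 numerically; it is an illustration rather than a proof, and the choices \(\lambda = 0.9\) and \(\beta ^t = 1/(t+1)\) are arbitrary.

lam = 0.9

def conv_sum(k, beta=lambda t: 1.0 / (t + 1)):
    """Compute sum_{t=0}^{k} lam**(k - t) * beta(t), the quantity in (36)."""
    return sum(lam ** (k - t) * beta(t) for t in range(k + 1))

for k in (10, 100, 1000, 10000):
    print(k, conv_sum(k))  # the values decrease toward zero as k grows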

With the help of the above lemmas, we are now ready to prove Theorem 3.1. Applying Lemma A.4 recursively, it holds that

$$\begin{aligned} \begin{aligned} \Vert \bar{{\mathbf {x}}}^{k+1} - {\mathbf {x}}^\star \Vert&\le c_4 \varepsilon ^{k-1} + c_3\Vert \bar{{\mathbf {x}}}^{k} - {\mathbf {x}}^\star \Vert \\&\le (c_3)^{k}\Vert \bar{{\mathbf {x}}}^{1} - {\mathbf {x}}^\star \Vert + c_4\sum _{t=0}^{k-1} (c_3)^{k-1-t}\varepsilon ^t. \end{aligned} \end{aligned}$$
(37)

According to Lemma A.5 and the fact that \(c_3 <1\) (see Lemma A.4), we have \(\lim _{k \rightarrow \infty } \Vert \bar{{\mathbf {x}}}^{k+1} - {\mathbf {x}}^\star \Vert = 0\). Furthermore, it holds that

$$\begin{aligned} \begin{aligned} \lim _{k \rightarrow \infty }\Vert {\mathbf {x}}_i^{k+1} - {\mathbf {x}}^\star \Vert&\le \lim _{k \rightarrow \infty }\Vert {\mathbf {x}}_i^{k+1}-\bar{{\mathbf {x}}}^{k+1}\Vert + \lim _{k \rightarrow \infty }\Vert \bar{{\mathbf {x}}}^{k+1}-{\mathbf {x}}^\star \Vert \\&\overset{(6.\mathrm {a})}{\le } \lim _{k \rightarrow \infty }2 \varepsilon ^{k} + \lim _{k \rightarrow \infty }(c_3)^{k}\Vert \bar{{\mathbf {x}}}^{1} - {\mathbf {x}}^\star \Vert + \lim _{k \rightarrow \infty }c_4\sum _{t=0}^{k-1} (c_3)^{k-1-t}\varepsilon ^t\\&\overset{(6.\mathrm {b})}{=} 0, \end{aligned} \end{aligned}$$
(38)

where (6.a) is due to (37) and Lemma A.1, and (6.b) follows from the fact that \(c_3 < 1\) and Lemma A.5. Therefore, statement 1) of Theorem 3.1 is proved.

At last, we consider statement 2), in which the sequence of consensus error bounds decays linearly, i.e., there exist constants \(c_0 > 0\) and \(0< \rho _0 <1\) such that \(\varepsilon ^k \le c_0 (\rho _0)^k\). Following the same path as in the proof of statement 1), we first show that the distance \(\Vert \bar{{\mathbf {x}}}^{k+1}-{\mathbf {x}}^\star \Vert \) converges to zero at a linear rate. To this end, choose a constant \(c' > 0\) large enough that both \(\rho ' = \max \{\rho _0, c_3 + c_4c_0/(c'\rho _0)\} <1\) and \(\Vert \bar{{\mathbf {x}}}^{1} - {\mathbf {x}}^\star \Vert \le c'\rho '\) hold (this is always possible since \(c_3 < 1\) and \(\rho _0 > 0\)); we claim that

$$\begin{aligned} \Vert \bar{{\mathbf {x}}}^{k+1} - {\mathbf {x}}^\star \Vert \le c' (\rho ')^{k+1}. \end{aligned}$$
(39)

Recalling inequality (33) from Lemma A.4, we prove (39) by induction. First, (39) holds for \(k=0\) by the choice of \(c'\). Suppose now that \(\Vert \bar{{\mathbf {x}}}^{k} - {\mathbf {x}}^\star \Vert \le c' (\rho ')^{k}\); for the \((k+1)\)th iteration we then have

$$\begin{aligned} \begin{aligned} \Vert \bar{{\mathbf {x}}}^{k+1} - {\mathbf {x}}^\star \Vert&\le c_4 \varepsilon ^{k-1} + c_3\Vert \bar{{\mathbf {x}}}^{k} - {\mathbf {x}}^\star \Vert \\&\le c_4c_0 \cdot (\rho _0)^{k-1} + c_3c'\cdot (\rho ')^{k}\\&= c' (\rho ')^k \Big (c_3 + \frac{c_4c_0}{c'\rho _0}(\frac{\rho _0}{\rho '})^k\Big )\\& \overset{(7.\mathrm {a})}{\le } c'(\rho ')^{k+1}, \end{aligned} \end{aligned}$$
(40)

where (7.a) is due to the facts that \(\rho ' \ge \rho _0\) and \(\rho ' \ge c_3 + c_4c_0/(c'\rho _0)\). Now, following the same procedure as shown in (38), there exist constants \(c_1 = 2c_0/\rho _0 + c'\) and \(\rho _1 = \max \{\rho _0, \rho '\}\) such that

$$\begin{aligned} \begin{aligned} \Vert {\mathbf {x}}_i^{k+1} - {\mathbf {x}}^\star \Vert&\le \Vert {\mathbf {x}}_i^{k+1}-\bar{{\mathbf {x}}}^{k+1}\Vert + \Vert \bar{{\mathbf {x}}}^{k+1}-{\mathbf {x}}^\star \Vert \\& \overset{(8.\mathrm {a})}{\le } 2c_0\cdot (\rho _0)^{k} + c'\cdot (\rho ')^{k+1}\\&\le c_1\cdot (\rho _1)^{k+1}, \end{aligned} \end{aligned}$$
(41)

where (8.a) follows from Lemma A.1 together with (39) and the assumed bound \(\varepsilon ^k \le c_0(\rho _0)^k\). Therefore, the proof is completed. \(\square \)
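As a final numerical sanity check of the linear-rate argument in statement 2) (all constants below are illustrative assumptions, not values from the paper), one can verify that choosing \(c'\) large enough indeed yields \(\rho ' < 1\):

import math

mu, L = 1.0, 10.0                              # illustrative problem constants
alpha = 2.0 / (mu + L)
c2 = 2.0 * L                                   # Lemma A.2
c3 = math.sqrt(1.0 - 2.0 * alpha * mu * L / (mu + L))
c4 = alpha * c2 + 1.0                          # Lemma A.4

c0, rho0 = 1.0, 0.85                           # assumed linear decay of the error bounds
# choose c' large enough that c3 + c4*c0/(c'*rho0) < 1
c_prime = 2.0 * c4 * c0 / ((1.0 - c3) * rho0)
rho_prime = max(rho0, c3 + c4 * c0 / (c_prime * rho0))
c1 = 2.0 * c0 / rho0 + c_prime                 # the constant appearing in (41)

assert rho_prime < 1.0
print(f"rho' = {rho_prime:.4f}, c1 = {c1:.2f}")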


Cite this article

Du, B., Zhou, J. & Sun, D. Improving the Convergence of Distributed Gradient Descent via Inexact Average Consensus. J Optim Theory Appl 185, 504–521 (2020). https://doi.org/10.1007/s10957-020-01648-3
