Abstract
Distributed optimization has developed rapidly in recent years owing to its wide applications in machine learning and signal processing. In this paper, we investigate a distributed optimization problem of minimizing a global objective that is the sum of smooth and strongly convex local cost functions distributed over an undirected network of n nodes. In contrast to existing works, we incorporate a distributed heavy-ball term to improve the convergence performance of the proposed algorithm. To accelerate existing distributed stochastic first-order gradient methods, this momentum term is combined with a gradient-tracking technique. It is shown that the proposed algorithm converges faster than GT-SAGA without increasing the complexity. Extensive experiments on real-world datasets verify the effectiveness and correctness of the proposed algorithm.
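The combination of gradient tracking with a heavy-ball momentum term described above can be illustrated with a small numerical sketch. This is not the paper's exact algorithm: it uses full local gradients rather than the SAGA-type stochastic gradients of GT-SAGA, and the network (a 5-node ring), step size, and momentum weight are illustrative assumptions. Each node holds a strongly convex least-squares cost, mixes estimates with its neighbors through a doubly stochastic matrix W, takes a heavy-ball step along its gradient tracker, and updates the tracker so that its average follows the average gradient.

```python
import numpy as np

# Hedged sketch: deterministic gradient tracking + heavy-ball momentum on a
# 5-node ring; the paper's method additionally uses stochastic variance-reduced
# (SAGA-type) gradients, which are omitted here for clarity.
rng = np.random.default_rng(0)
n, d, m = 5, 3, 10           # nodes, variable dimension, samples per node

# Local cost f_i(x) = 0.5 * ||A_i x - b_i||^2 (smooth, strongly convex w.h.p.)
A = [rng.standard_normal((m, d)) for _ in range(n)]
b = [rng.standard_normal(m) for _ in range(n)]

def grad(i, x):
    """Gradient of the i-th local cost: A_i^T (A_i x - b_i)."""
    return A[i].T @ (A[i] @ x - b[i])

# Doubly stochastic mixing matrix: Metropolis weights on an undirected ring
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1 / 3
    W[i, i] = 1 / 3

# Centralized optimum for reference: (sum_i A_i^T A_i) x* = sum_i A_i^T b_i
H = sum(Ai.T @ Ai for Ai in A)
g = sum(Ai.T @ bi for Ai, bi in zip(A, b))
x_star = np.linalg.solve(H, g)

alpha, beta = 0.01, 0.3      # illustrative step size and momentum weight
X = np.zeros((n, d))         # row i holds node i's current estimate
X_prev = X.copy()
Y = np.vstack([grad(i, X[i]) for i in range(n)])   # gradient trackers

for _ in range(4000):
    # Consensus + gradient step + heavy-ball momentum term
    X_new = W @ X - alpha * Y + beta * (X - X_prev)
    # Gradient tracking: the average of Y follows the average local gradient
    Y = W @ Y + np.vstack([grad(i, X_new[i]) - grad(i, X[i])
                           for i in range(n)])
    X_prev, X = X, X_new
```

After the loop, every row of `X` should agree with `x_star` to high accuracy, showing both consensus across nodes and convergence to the global minimizer; the stochastic, variance-reduced version replaces `grad` with SAGA-style estimators while keeping the same mixing, tracking, and momentum structure.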
Author information
Authors and Affiliations
Contributions
Bihao SUN designed the research, processed the data, and drafted the manuscript. Jinhui HU, Dawen XIA, and Huaqing LI helped organize the manuscript and process the data. Bihao SUN and Huaqing LI revised and finalized the paper.
Corresponding author
Ethics declarations
Bihao SUN, Jinhui HU, Dawen XIA, and Huaqing LI declare that they have no conflict of interest.
Additional information
Project supported by the Open Research Fund Program of Data Recovery Key Laboratory of Sichuan Province, China (No. DRN2001), the National Natural Science Foundation of China (Nos. 61773321 and 61762020), the Science and Technology Top-Notch Talents Support Project of Colleges and Universities in Guizhou Province, China (No. QJHKY2016065), the Science and Technology Foundation of Guizhou Province, China (No. QKHJC20181083), and the Science and Technology Talents Fund for Excellent Young of Guizhou Province, China (No. QKHPTRC20195669).
Huaqing LI received his BS degree in information and computing science in 2009 from Chongqing University of Posts and Telecommunications, Chongqing, China, and his PhD degree in computer science and technology in 2013 from Chongqing University. From Sept. 2014 to Sept. 2015, he was a postdoctoral researcher at the School of Electrical and Information Engineering, The University of Sydney, Australia. From Nov. 2015 to Nov. 2016, he was a postdoctoral researcher at the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. He is currently a professor at the College of Electronic and Information Engineering, Southwest University, Chongqing, China. His main research interests include nonlinear dynamics and control, multi-agent systems, and distributed optimization. Prof. LI currently serves as a regional editor for Neur Comput Appl, an editorial board member for IEEE Access, and a corresponding expert for Front Inform Technol Electron Eng.
Rights and permissions
About this article
Cite this article
Sun, B., Hu, J., Xia, D. et al. A distributed stochastic optimization algorithm with gradient-tracking and distributed heavy-ball acceleration. Front Inform Technol Electron Eng 22, 1463–1476 (2021). https://doi.org/10.1631/FITEE.2000615
Key words
- Distributed optimization
- High-performance algorithm
- Multi-agent system
- Machine-learning problem
- Stochastic gradient