Abstract
This paper investigates decentralized composite optimization problems with a common non-smooth regularization term over an undirected, connected network. Many gradient-based proximal distributed methods exist for this setting, but most of them converge only sublinearly, and proving linear convergence for this family of algorithms is notoriously difficult. To set up the problem, we assume that all networked agents share the same non-smooth regularization term, as is the case in most machine learning problems solved by centralized optimization. In this scenario, most existing proximal-gradient algorithms tend to ignore the cost of gradient evaluations, which degrades performance. To address this issue, we model each local cost function as the average of a moderate number of local cost subfunctions and develop an edge-based stochastic proximal gradient algorithm (SPG-Edge) that employs a local unbiased stochastic averaging gradient method. When the non-smooth term is absent, the proposed algorithm reduces to some notable primal-dual algorithms, such as EXTRA and DIGing. Finally, we provide a simplified proof of linear convergence and conduct numerical experiments to validate the theoretical results.
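To make the two ingredients of the abstract concrete, the sketch below combines a proximal step (here for an assumed ℓ1 regularizer, whose proximal operator is soft-thresholding) with a SAGA-style unbiased stochastic averaging gradient estimator over local subfunctions. This is a minimal single-agent illustration, not the paper's SPG-Edge algorithm (which additionally couples agents through edge-based consensus terms); the names `soft_threshold` and `saga_prox_step` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def saga_prox_step(x, grads_table, grad_fn, i, alpha, lam):
    """One SAGA-style proximal step on local subfunction i.

    grads_table[j] holds the gradient last evaluated at subfunction j.
    g = grad_fn(i, x) - grads_table[i] + mean(grads_table) is an
    unbiased estimate of the full local gradient when i is sampled
    uniformly, with variance that vanishes as the table converges.
    """
    g_new = grad_fn(i, x)
    g = g_new - grads_table[i] + grads_table.mean(axis=0)
    grads_table[i] = g_new  # refresh the stored gradient for subfunction i
    return soft_threshold(x - alpha * g, alpha * lam)

# Toy usage: minimize (1/3) * sum_i 0.5 * (x - a_i)^2 + lam * |x|,
# whose minimizer is soft_threshold(mean(a), lam) = 1.5 here.
a = np.array([[1.0], [2.0], [3.0]])
grads_table = np.zeros((3, 1))
x = np.zeros(1)
for k in range(2000):
    x = saga_prox_step(x, grads_table, lambda i, z: z - a[i], k % 3, 0.2, 0.5)
```

Because each stored gradient is only refreshed when its subfunction is sampled, the per-iteration cost is a single gradient evaluation, which is the evaluation-cost saving the abstract alludes to.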
References
A. Nedic, A. Olshevsky, and W. Shi, “Achieving geometric convergence for distributed optimization over time-varying graphs,” SIAM Journal on Optimization, vol. 27, no. 4, pp. 2597–2633, 2017.
W. Shi, Q. Ling, G. Wu, and W. Yin, “EXTRA: An exact first-order algorithm for decentralized consensus optimization,” SIAM Journal on Optimization, vol. 25, no. 2, pp. 944–966, 2015.
I. Schizas, A. Ribeiro, and G. Giannakis, “Consensus in ad hoc WSNs with noisy links, Part I: Distributed estimation of deterministic signals,” IEEE Transactions on Signal Processing, vol. 56, no. 1, pp. 350–364, 2008.
N. Aybat, Z. Wang, T. Lin, and S. Ma, “Distributed linearized alternating direction method of multipliers for composite convex consensus optimization,” IEEE Transactions on Automatic Control, vol. 63, no. 1, pp. 5–20, 2018.
W. Shi, Q. Ling, K. Yuan, G. Wu, and W. Yin, “On the linear convergence of the ADMM in decentralized consensus optimization,” IEEE Transactions on Signal Processing, vol. 62, pp. 1750–1761, 2014.
F. Yan, S. Sundaram, S. Vishwanathan, and Y. Qi, “Distributed autonomous online learning: Regrets and intrinsic privacy-preserving properties,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 11, pp. 2483–2493, 2013.
W. Shi, Q. Ling, G. Wu, and W. Yin, “A proximal gradient algorithm for decentralized composite optimization,” IEEE Transactions on Signal Processing, vol. 63, no. 22, pp. 6013–6023, 2015.
A. H. Sayed, “Adaptation, learning, and optimization over networks,” Foundations and Trends in Machine Learning, vol. 7, no. 4–5, pp. 311–801, 2014.
K. Yuan, B. Ying, X. Zhao, and A. Sayed, “Exact diffusion for distributed optimization and learning-Part I: Algorithm development,” IEEE Transactions on Signal Processing, vol. 67, no. 3, pp. 708–723, 2019.
V. Smith, S. Forte, M. Chenxin, M. Takac, M. Jordan, and M. Jaggi, “CoCoA: A general framework for communication efficient distributed optimization,” Journal of Machine Learning Research, vol. 18, no. 1, pp. 8590–8638, 2018.
R. Johnson and T. Zhang, “Accelerating stochastic gradient descent using predictive variance reduction,” Advances in Neural Information Processing Systems, pp. 315–323, 2013.
A. Defazio, F. Bach, and S. Lacoste-Julien, “SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives,” Advances in Neural Information Processing Systems, pp. 1646–1654, 2014.
A. Mokhtari and A. Ribeiro, “DSA: Decentralized double stochastic averaging gradient algorithm,” The Journal of Machine Learning Research, vol. 17, pp. 1–35, 2016.
A. Mokhtari, Q. Ling, and A. Ribeiro, “Network Newton distributed optimization methods,” IEEE Transactions on Signal Processing, vol. 65, no. 1, pp. 146–161, 2016.
K. Seaman, F. Bach, S. Bubeck, Y. T. Lee, and L. Massoulie, “Optimal algorithms for smooth and strongly convex distributed optimization in networks,” Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 3027–3036, 2017.
J. Mota, J. Xavier, and P. Aguiar, “D-ADMM: A communication-efficient distributed algorithm for separable optimization,” IEEE Transactions on Signal Processing, vol. 61, no. 10, pp. 2718–2723, 2013.
G. Mateos, J. Bazerque, and G. Giannakis, “Distributed sparse linear regression,” IEEE Transactions on Signal Processing, vol. 58, no. 10, pp. 5262–5276, 2010.
I. Boulkaibet, K. Belarbi, S. Bououden, M. Chadli, and T. Marwala, “An adaptive fuzzy predictive control of nonlinear processes based on multi-kernel least squares support vector regression,” Applied Soft Computing, vol. 73, pp. 572–590, 2018.
J. Tsitsiklis, D. Bertsekas, and M. Athans, “Distributed asynchronous deterministic and stochastic gradient optimization algorithms,” IEEE Transactions on Automatic Control, vol. 31, no. 9, pp. 803–812, 1986.
Q. Ling and Z. Tian, “Decentralized sparse signal recovery for compressive sleeping wireless sensor networks,” IEEE Transactions on Signal Processing, vol. 58, no. 7, pp. 3816–3827, 2010.
S. Safavi and U. A. Khan, “An opportunistic linear-convex algorithm for localization in mobile robot networks,” IEEE Transactions on Robotics, vol. 33, no. 4, pp. 875–888, 2017.
T.-H. Chang, M. Hong, and X. Wang, “Multi-agent distributed optimization via inexact consensus ADMM,” IEEE Transactions on Signal Processing, vol. 63, no. 2, pp. 482–497, 2015.
Z. Wang, L. Zheng, and H. Li, “Distributed optimization over general directed networks with random sleep scheme,” International Journal of Control, Automation and Systems, vol. 18, no. 10, pp. 2534–2542, 2020.
L. He, A. Bian, and M. Jaggi, “COLA: Decentralized linear learning,” Advances in Neural Information Processing Systems, pp. 4536–4546, 2018.
Z. Wang and H. Li, “Edge-based stochastic gradient algorithm for distributed optimization,” IEEE Transactions on Network Science and Engineering, vol. 7, no. 3, pp. 1421–1430, 2020.
W. S. Sui, G. R. Duan, M. Z. Hou, and M. R. Zhang, “Distributed fixed-time attitude synchronization control for multiple rigid spacecraft,” International Journal of Control, Automation and Systems, vol. 17, no. 6, pp. 1117–1130, 2019.
Q. Zhang, Z. Gong, Z. Yang, and J. Chen, “Distributed convex optimization for flocking of nonlinear multi-agent systems,” International Journal of Control, Automation and Systems, vol. 17, no. 5, pp. 1177–1183, 2019.
S. Alghunaim, K. Yuan, and A. Sayed, “A linearly convergent proximal gradient algorithm for decentralized optimization,” Advances in Neural Information Processing Systems, pp. 2844–2854, 2019.
Author information
Additional information
Recommended by Associate Editor Shun-ichi Azuma under the direction of Editor Yoshito Ohta.
The work described in this paper was supported in part by the Open Research Fund Program of Data Recovery Key Laboratory of Sichuan Province (Grant no. DRN2001) and in part by the National Natural Science Foundation of China (Grant no. 61773321).
Ling Zhang is currently pursuing an undergraduate degree in the School of Electronic and Information Engineering at Southwest University, Chongqing, China. Her main research interests include distributed optimization, artificial intelligence, multi-agent systems, and integrated circuits.
Yu Yan received her B.S. and M.S. degrees from Chongqing Medical University, Chongqing, China, in 2014 and 2016, respectively. She is currently a research assistant at the Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China. Her research interests include distributed optimization and medical image processing.
Zheng Wang received his B.E. degree in electronic information science and technology from the University of Jinan, Jinan, China, in 2015, and an M.E. degree in electronics and communication engineering from Southwest University, Chongqing, China, in 2018. He is currently pursuing a Ph.D. degree with the School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, NSW, Australia. His research interests include multiagent systems, distributed optimization, and their applications in smart grid.
Huaqing Li received his B.S. degree in information and computing science from Chongqing University of Posts and Telecommunications in 2009, and a Ph.D. degree in computer science and technology from Chongqing University in 2013. He was a Postdoctoral Researcher at the School of Electrical and Information Engineering, The University of Sydney from September 2014 to September 2015, and at the School of Electrical and Electronic Engineering, Nanyang Technological University from November 2015 to November 2016. Since July 2018, he has been a professor at the College of Electronic and Information Engineering, Southwest University. His main research interests include nonlinear dynamics and control, multi-agent systems, and distributed optimization. He serves as a Regional Editor for Neural Computing and Applications and an Editorial Board Member for IEEE Access.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhang, L., Yan, Y., Wang, Z. et al. An Edge-based Stochastic Proximal Gradient Algorithm for Decentralized Composite Optimization. Int. J. Control Autom. Syst. 19, 3598–3610 (2021). https://doi.org/10.1007/s12555-020-0483-9