Abstract
This paper reveals that a residual measure operator plays a common and central role in many error bound (EB) conditions and in a variety of gradient-type methods. On the one hand, by linking this operator with other optimality measures, we define a group of abstract EB conditions and analyze the interplay between them; on the other hand, by using this operator as an ascent direction, we propose an abstract gradient-type method and derive EB conditions that are necessary and sufficient for its linear convergence. The former provides a unified framework that not only allows us to find new connections between many existing EB conditions, but also paves the way to constructing new ones. The latter allows us to identify the weakest conditions guaranteeing linear convergence for a number of fundamental algorithms, including the gradient method, the proximal point algorithm, and the forward–backward splitting algorithm. In addition, we show linear convergence of the proximal alternating linearized minimization algorithm under a group of equivalent EB conditions that are strictly weaker than the traditional strong convexity condition. Moreover, by defining a new EB condition, we show Q-linear convergence of Nesterov’s accelerated forward–backward algorithm without strong convexity. Finally, we verify EB conditions for a class of dual objective functions.
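As a concrete illustration (stated in standard notation, which may differ from the paper's exact formulation), consider the composite problem of minimizing \(f(x) + g(x)\) with \(f\) smooth and \(g\) admitting an easily computable proximal map. The residual measure operator associated with the forward–backward splitting algorithm, together with an EB condition of the kind analyzed here, can be written as

\[
  R_\gamma(x) \;=\; \frac{1}{\gamma}\Bigl(x - \operatorname{prox}_{\gamma g}\bigl(x - \gamma \nabla f(x)\bigr)\Bigr), \qquad \gamma > 0,
\]
\[
  \operatorname{dist}\bigl(x,\mathcal{X}^{\ast}\bigr) \;\le\; \kappa\,\bigl\|R_\gamma(x)\bigr\| \qquad \text{whenever } \bigl\|R_\gamma(x)\bigr\| \le \varepsilon,
\]

where \(\mathcal{X}^{\ast}\) is the solution set and \(\kappa,\varepsilon > 0\). Since \(R_\gamma(x) = 0\) exactly at stationary points, \(\|R_\gamma(x)\|\) serves as the optimality measure, and conditions of this type are what guarantee linear convergence of the iteration \(x^{k+1} = \operatorname{prox}_{\gamma g}\bigl(x^{k} - \gamma\nabla f(x^{k})\bigr)\).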
Acknowledgements
I am grateful to the anonymous referees, the associate editor, and the coeditor Prof. Adrian S. Lewis for many useful comments, which allowed me to significantly improve the original presentation. I would like to thank Prof. Zaiwen Wen for his invitation and hospitality during my visit to the Beijing International Center for Mathematical Research, and Prof. Dmitriy Drusvyatskiy for a careful reading of an early draft of this manuscript and for valuable comments and suggestions. I also thank Profs. Chao Ding, Bin Dong, Lei Guo, Yongjin Liu, Deren Han, Mark Schmidt, Anthony Man-Cho So, and Wotao Yin for their time and many helpful discussions. Further thanks are due to my cousin Boya Ouyang, who helped me with my English writing, and to the PhD students Ke Guo, Wei Peng, Ziyang Yuan, and Xiaoya Zhang, who looked over the manuscript and corrected several typos. While visiting the Chinese Academy of Sciences, I was particularly fortunate to become acquainted with Prof. Florian Jarre, who carefully read and polished this paper. This work was supported by the National Science Foundation of China (Nos. 11501569 and 61571008).
Cite this article
Zhang, H. New analysis of linear convergence of gradient-type methods via unifying error bound conditions. Math. Program. 180, 371–416 (2020). https://doi.org/10.1007/s10107-018-01360-1
Keywords
- Residual measure operator
- Error bound
- Gradient descent
- Linear convergence
- Proximal point algorithm
- Forward–backward splitting algorithm
- Proximal alternating linearized minimization
- Nesterov’s acceleration
- Dual objective function