New analysis of linear convergence of gradient-type methods via unifying error bound conditions

  • Full Length Paper
  • Series A
  • Mathematical Programming

Abstract

This paper reveals that a residual measure operator plays a common and central role in many error bound (EB) conditions and in a variety of gradient-type methods. On the one hand, by linking this operator with other optimality measures, we define a group of abstract EB conditions and then analyze the interplay between them; on the other hand, by using this operator as an ascent direction, we propose an abstract gradient-type method and then derive EB conditions that are necessary and sufficient for its linear convergence. The former provides a unified framework that not only allows us to find new connections between many existing EB conditions, but also paves the way to constructing new ones. The latter allows us to identify the weakest conditions guaranteeing linear convergence for a number of fundamental algorithms, including the gradient method, the proximal point algorithm, and the forward–backward splitting algorithm. In addition, we show linear convergence of the proximal alternating linearized minimization algorithm under a group of equivalent EB conditions, which are strictly weaker than the traditional strong convexity condition. Moreover, by defining a new EB condition, we show Q-linear convergence of Nesterov’s accelerated forward–backward algorithm without strong convexity. Finally, we verify EB conditions for a class of dual objective functions.
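To make the residual-operator viewpoint concrete, the sketch below (not taken from the paper) runs plain forward–backward splitting on a Lasso-type problem and monitors the norm of the proximal-gradient mapping, a standard instance of a residual measure operator. Under an error bound of the form dist(x, X*) ≤ κ‖R(x)‖, this quantity is known to contract at a linear rate. The objective, step size, and all names (forward_backward, soft_threshold) are illustrative assumptions, not the paper's abstract algorithm or constants.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (the nonsmooth part g in this example)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def forward_backward(A, b, lam, step, iters=200):
    """Forward-backward splitting for min_x 0.5*||Ax - b||^2 + lam*||x||_1.

    The monitored residual is the proximal-gradient mapping
    R(x) = (x - prox_{step*g}(x - step*grad f(x))) / step,
    which vanishes exactly at minimizers; under an error bound condition
    dist(x, X*) <= kappa * ||R(x)||, its norm decays linearly.
    """
    x = np.zeros(A.shape[1])
    residual_norms = []
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                              # forward (gradient) step
        x_new = soft_threshold(x - step * grad, step * lam)   # backward (proximal) step
        residual_norms.append(np.linalg.norm(x - x_new) / step)
        x = x_new
    return x, residual_norms

# Tiny synthetic usage example: the ratio of successive residual norms settling
# below 1 is consistent with linear convergence of the residual measure.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
b = A @ rng.standard_normal(20)
step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1/L with L = ||A||_2^2
x_hat, res = forward_backward(A, b, lam=0.1, step=step)
print("residual norms (start / middle / end):", res[0], res[100], res[-1])
print("successive ratio near the end:", res[-1] / res[-2])
```

In this illustration the design matrix has full column rank, so the composite objective satisfies the error bound in question and the printed ratio stays bounded away from 1, mirroring the kind of linear-rate behaviour the abstract conditions are meant to capture.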


Acknowledgements

I am grateful to the anonymous referees, the associate editor, and the co-editor Prof. Adrian S. Lewis for many useful comments, which allowed me to significantly improve the original presentation. I would like to thank Prof. Zaiwen Wen for his invitation and hospitality during my visit to the Beijing International Center for Mathematical Research, and to thank Prof. Dmitriy Drusvyatskiy for a careful reading of an early draft of this manuscript and for valuable comments and suggestions. I also thank Profs. Chao Ding, Bin Dong, Lei Guo, Yongjin Liu, Deren Han, Mark Schmidt, Anthony Man-Cho So, and Wotao Yin for their time and many helpful discussions. Further thanks are due to my cousin Boya Ouyang, who helped me with my English writing, and to the PhD students Ke Guo, Wei Peng, Ziyang Yuan, and Xiaoya Zhang, who looked over the manuscript and corrected several typos. While visiting the Chinese Academy of Sciences, I was particularly fortunate to become acquainted with Prof. Florian Jarre, who carefully read and polished this paper. This work is supported by the National Science Foundation of China (Nos. 11501569 and 61571008).

Author information

Corresponding author

Correspondence to Hui Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, H. New analysis of linear convergence of gradient-type methods via unifying error bound conditions. Math. Program. 180, 371–416 (2020). https://doi.org/10.1007/s10107-018-01360-1

