
Accelerated proximal point method for maximally monotone operators

  • Full Length Paper
  • Series A

Mathematical Programming

Abstract

This paper proposes an accelerated proximal point method for maximally monotone operators. The proof is computer-assisted, using the performance estimation problem (PEP) approach. Because the proximal point method subsumes various well-known convex optimization methods, such as the proximal method of multipliers and the alternating direction method of multipliers, the proposed acceleration has wide applicability. Numerical experiments demonstrate the accelerating behavior.
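
As background for the acceleration, the following is a minimal sketch (not code from the paper) of the underlying proximal point iteration \({\varvec{x}}_{k+1} = {\varvec{J}}_{\lambda {\varvec{M}}}({\varvec{x}}_k)\), where \({\varvec{J}}_{\lambda {\varvec{M}}} = ({\varvec{I}} + \lambda {\varvec{M}})^{-1}\) is the resolvent of a maximally monotone operator \({\varvec{M}}\). The soft-thresholding resolvent used for illustration is the standard resolvent of \({\varvec{M}} = \partial \Vert \cdot \Vert_1\); the function names are illustrative, not from the paper.

```python
import numpy as np

def soft_threshold(x, lam):
    """Resolvent of lam * subdifferential of ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def proximal_point(resolvent, x0, lam, num_iters):
    """Plain proximal point iteration x_{k+1} = J_{lam M}(x_k);
    the paper's contribution is an accelerated variant of this scheme."""
    x = x0
    for _ in range(num_iters):
        x = resolvent(x, lam)  # one resolvent (proximal) step
    return x

# Usage: iterates shrink toward a zero of M = subdifferential of ||.||_1.
x = proximal_point(soft_threshold, np.array([3.0, -0.5, 1.2]), lam=0.4, num_iters=10)
print(x)
```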



Notes

  1. The convergence of the fixed-point residual does not guarantee the convergence of the sequence of iterates \(\{{\varvec{x}}_i\}\). We leave the convergence analysis of the sequence as future work, possibly building on the convergence analysis in [10] for Nesterov’s fast gradient method [51, 52] and FISTA [8] in convex minimization.

  2. A convex–concave function \(\phi\) satisfies \(\phi({\varvec{u}}_*,{\varvec{v}}_N) \ge \phi({\varvec{u}}_N,{\varvec{v}}_N) + \langle {\varvec{u}}_* - {\varvec{u}}_N,\, {\varvec{q}}_{{\varvec{u}},N} \rangle\) for \({\varvec{q}}_{{\varvec{u}},N} \in \partial_{{\varvec{u}}}\phi({\varvec{u}}_N,{\varvec{v}}_N)\), and \(-\phi({\varvec{u}}_N,{\varvec{v}}_*) \ge -\phi({\varvec{u}}_N,{\varvec{v}}_N) + \langle {\varvec{v}}_* - {\varvec{v}}_N,\, -{\varvec{q}}_{{\varvec{v}},N} \rangle\) for \(-{\varvec{q}}_{{\varvec{v}},N} \in \partial_{{\varvec{v}}}(-\phi({\varvec{u}}_N,{\varvec{v}}_N))\). Adding these two inequalities yields \(\langle {\varvec{x}}_N - {\varvec{x}}_*,\, {\varvec{q}}_N \rangle \ge \phi({\varvec{u}}_N,{\varvec{v}}_*) - \phi({\varvec{u}}_*,{\varvec{v}}_N)\), where \({\varvec{x}}_N := ({\varvec{u}}_N,{\varvec{v}}_N)\) and \({\varvec{q}}_N := ({\varvec{q}}_{{\varvec{u}},N},-{\varvec{q}}_{{\varvec{v}},N})\).

  3. We found adaptively restarting the method whenever the fixed-point residual increases to be a good option in practice (see the sketch following these notes).

  4. Since \(\nabla f\) is Lipschitz continuous, the operator \({\varvec{M}}_1 = -{\varvec{D}}\,\partial f^*(-{\varvec{D}}^\top \cdot)\) in (53) for the problem (62) is strongly monotone, but this is insufficient to guarantee strong monotonicity of \({\varvec{M}}_{\rho,{\varvec{M}}_1,{\varvec{M}}_2}\) in (46) for the problem (62).
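
The adaptive restart heuristic of Note 3 can be sketched as follows. This is a hedged illustration: `resolvent` is a user-supplied stand-in for \({\varvec{J}}_{\lambda {\varvec{M}}}\), and the Nesterov-style momentum update is an assumed generic form, not the paper's exact accelerated proximal point coefficients; only the restart rule (reset the momentum when the fixed-point residual \(\Vert {\varvec{x}}_k - {\varvec{J}}_{\lambda {\varvec{M}}}({\varvec{x}}_k)\Vert\) increases) follows the note.

```python
import numpy as np

def accelerated_ppm_with_restart(resolvent, x0, num_iters):
    """Inertial proximal point sketch with the adaptive restart of Note 3:
    drop the momentum history whenever the fixed-point residual increases."""
    x, y_prev, k_local = x0, None, 0  # k_local counts iterations since last restart
    prev_res = np.inf
    y = x0
    for _ in range(num_iters):
        y = resolvent(x)                      # proximal (resolvent) step
        res = np.linalg.norm(x - y)           # fixed-point residual ||x - J(x)||
        if res > prev_res:                    # residual increased: restart
            k_local, y_prev = 0, None         # reset momentum history
        if y_prev is None:
            x_next = y
        else:
            beta = k_local / (k_local + 2.0)  # generic momentum weight (assumed)
            x_next = y + beta * (y - y_prev)  # inertial extrapolation (assumed form)
        x, y_prev, prev_res = x_next, y, res
        k_local += 1
    return y
```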

References

  1. Alvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9(1–2), 3–11 (2001). https://doi.org/10.1023/A:1011253113155

  2. Attouch, H., Cabot, A.: Convergence of a relaxed inertial forward-backward algorithm for structured monotone inclusions. Appl. Math. Optim. 80(3), 547–598 (2019). https://doi.org/10.1007/s00245-019-09584-z

  3. Attouch, H., Cabot, A.: Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Math. Program. 184(1–2), 243–287 (2020). https://doi.org/10.1007/s10107-019-01412-0

  4. Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping. Math. Program. (2020). https://doi.org/10.1007/s10107-020-01591-1

  5. Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim. 29(3), 2227–2256 (2019). https://doi.org/10.1137/18M1230207

  6. Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximally monotone operators. Math. Program. 174(1–2), 391–432 (2019). https://doi.org/10.1007/s10107-018-1252-x

  7. Bauschke, H.H., Combettes, P.L.: Convex analysis and monotone operator theory in Hilbert spaces. Springer, Berlin (2011). https://doi.org/10.1007/978-1-4419-9467-7

  8. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542

  9. Brezis, H., Lions, P.L.: Produits infinis de résolvantes. Isr. J. Math. 29(4), 329–345 (1978). https://doi.org/10.1007/BF02761171

  10. Chambolle, A., Dossal, C.: On the convergence of the iterates of the “Fast iterative shrinkage/thresholding algorithm”. J. Optim. Theory Appl. 166(3), 968–982 (2015). https://doi.org/10.1007/s10957-015-0746-4

  11. Chambolle, A., Pock, T.: A first-order primal–dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011). https://doi.org/10.1007/s10851-010-0251-1

  12. Chambolle, A., Pock, T.: An introduction to continuous optimization for imaging. Acta Numer. 25, 161–319 (2016). https://doi.org/10.1017/S096249291600009X

  13. Chambolle, A., Pock, T.: On the ergodic convergence rates of a first-order primal-dual algorithm. Math. Program. 159(1), 253–287 (2016). https://doi.org/10.1007/s10107-015-0957-3

  14. Combettes, P.L.: Monotone operator theory in convex optimization. Math. Program. 170(1), 177–206 (2018)

  15. Corman, E., Yuan, X.: A generalized proximal point algorithm and its convergence rate. SIAM J. Optim. 24(4), 1614–1638 (2014). https://doi.org/10.1137/130940402

  16. CVX Research Inc.: CVX: Matlab software for disciplined convex programming, version 2.0. http://cvxr.com/cvx (2012)

  17. Davis, D., Yin, W.: Convergence rate analysis of several splitting schemes. In: Glowinski, R., Osher, S., Yin, W. (eds.) Splitting Methods in Communication, Imaging, Science, and Engineering. Springer, Berlin (2016)

  18. Davis, D., Yin, W.: Faster convergence rates of relaxed Peaceman–Rachford and ADMM under regularity assumptions. Math. Oper. Res. 42(3), 783–805 (2017). https://doi.org/10.1287/moor.2016.0827

  19. Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82(2), 421–439 (1956)

  20. Drori, Y., Taylor, A.B.: Efficient first-order methods for convex minimization: a constructive approach. Math. Program. 184(1–2), 183–220 (2020). https://doi.org/10.1007/s10107-019-01410-2

  21. Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1–2), 451–482 (2014). https://doi.org/10.1007/s10107-013-0653-0

  22. Drori, Y., Teboulle, M.: An optimal variant of Kelley’s cutting-plane method. Math. Program. 160(1), 321–351 (2016). https://doi.org/10.1007/s10107-016-0985-7

  23. Eckstein, J.: The Lions–Mercier splitting algorithm and the alternating direction method are instances of the proximal point method. Technical report LIDS-P-1769 (1988)

  24. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992). https://doi.org/10.1007/BF01581204

  25. Esser, E., Zhang, X., Chan, T.: A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–1046 (2010). https://doi.org/10.1137/09076934X

  26. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl. 2(1), 17–40 (1976). https://doi.org/10.1016/0898-1221(76)90003-1

  27. Glowinski, R., Marrocco, A.: Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires. Rev. Française Automat. Inf. Rech. Opér. 9(R–2), 41–76 (1975)

  28. Goldstein, T., O’Donoghue, B., Setzer, S., Baraniuk, R.: Fast alternating direction optimization methods. SIAM J. Imaging Sci. 7(3), 1588–1623 (2014). https://doi.org/10.1137/120896219

  29. Gol’shtein, E.G., Tret’yakov, N.V.: Modified Lagrangians in convex programming and their generalizations. In: Huard, P. (ed.) Point-to-Set Maps and Mathematical Programming, Mathematical Programming Studies 10. Springer, Berlin (1979)

  30. Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S., Kimura, H. (eds.) Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp. 95–110. Springer, Berlin (2008)

  31. Gu, G., Yang, J.: On the optimal ergodic sublinear convergence rate of the relaxed proximal point algorithm for variational inequalities (2019). arXiv:1905.06030

  32. Gu, G., Yang, J.: Tight sublinear convergence rate of the proximal point algorithm for maximal monotone inclusion problems. SIAM J. Optim. 30(3), 1905–1921 (2020). https://doi.org/10.1137/19M1299049

  33. Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29(2), 403–419 (1991). https://doi.org/10.1137/0329022

  34. Güler, O.: New proximal point algorithms for convex minimization. SIAM J. Optim. 2(4), 649–664 (1992). https://doi.org/10.1137/0802032

  35. He, B., Yuan, X.: Convergence analysis of primal-dual algorithms for a saddle-point problem: from contraction perspective. SIAM J. Imaging Sci. 5(1), 119–149 (2012). https://doi.org/10.1137/100814494

  36. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4(5), 303–320 (1969). https://doi.org/10.1007/BF00927673

  37. Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. Math. Program. 159(1), 81–107 (2016). https://doi.org/10.1007/s10107-015-0949-3

  38. Kim, D., Fessler, J.A.: Another look at the Fast Iterative Shrinkage/Thresholding Algorithm (FISTA). SIAM J. Optim. 28(1), 223–250 (2018). https://doi.org/10.1137/16M108940X

  39. Kim, D., Fessler, J.A.: Generalizing the optimized gradient method for smooth convex minimization. SIAM J. Optim. 28(2), 1920–1950 (2018). https://doi.org/10.1137/17m112124x

  40. Kim, D., Fessler, J.A.: Optimizing the efficiency of first-order methods for decreasing the gradient of smooth convex functions. J. Optim. Theory Appl. (2020). https://doi.org/10.1007/s10957-020-01770-2

  41. Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett. 15, 405–418 (2020). https://doi.org/10.1007/s11590-020-01617-9

  42. Lieder, F.: Projection based methods for conic linear programming—optimal first order complexities and norm constrained quasi-Newton methods. Doctoral dissertation, Universitäts- und Landesbibliothek der Heinrich-Heine-Universität Düsseldorf (2018). https://docserv.uni-duesseldorf.de/servlets/DerivateServlet/Derivate-49971/Dissertation.pdf

  43. Lin, H., Mairal, J., Harchaoui, Z.: Catalyst acceleration for first-order convex optimization: from theory to practice. J. Mach. Learn. Res. 18(212), 1–54 (2018)

  44. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979). https://doi.org/10.1137/0716071

  45. Lorenz, D., Pock, T.: An inertial forward–backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51(2), 311–325 (2015). https://doi.org/10.1007/s10851-014-0523-2

  46. Martinet, B.: Régularisation d’inéquations variationnelles par approximations successives. Rev. Française Informat. Recherche Opérationnelle 4, 154–158 (1970)

  47. Minty, G.J.: Monotone (nonlinear) operators in Hilbert space. Duke Math. J. 29(3), 341–346 (1962). https://doi.org/10.1215/S0012-7094-62-02933-2

  48. Minty, G.J.: On the monotonicity of the gradient of a convex function. Pac. J. Math. 14, 243–247 (1964)

  49. Moreau, J.J.: Proximité et dualité dans un espace hilbertien. Bulletin de la Société Mathématique de France 93, 273–299 (1965)

  50. Nemirovski, A.: Efficient methods in convex programming (1994). http://www2.isye.gatech.edu/~nemirovs/Lect_EMCO.pdf. visited on 05/2019

  51. Nesterov, Y.: A method for unconstrained convex minimization problem with the rate of convergence \(O(1/k^2)\). Dokl. Akad. Nauk. USSR 269(3), 543–547 (1983)

  52. Nesterov, Y.: On an approach to the construction of optimal methods of minimization of smooth convex functions. Ekonomika i Matematicheskie Metody 24, 509–517 (1988) (in Russian)

  53. Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013). https://doi.org/10.1007/s10107-012-0629-5

  54. O’Donoghue, B., Candes, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2015). https://doi.org/10.1007/s10208-013-9150-3

  55. Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964). https://doi.org/10.1016/0041-5553(64)90137-5

  56. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)

  57. Rockafellar, R.T.: Monotone operators associated with saddle functions and minimax problems. In: Browder, F.E. (ed.) Nonlinear Functional Analysis, Part 1, vol. 18, pp. 397–407. American Mathematical Society, New York (1970)

  58. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976). https://doi.org/10.1287/moor.1.2.97

  59. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976). https://doi.org/10.1137/0314056

  60. Ryu, E.K., Boyd, S.: A primer on monotone operator methods. Appl. Comput. Math. 15(1), 3–43 (2016)

  61. Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020). https://doi.org/10.1137/19M1304854

  62. Ryu, E.K., Yin, W.: Large-scale convex optimization via monotone operators (2020). https://large-scale-book.mathopt.com/LSCOMO.pdf. visited on 03/2021

  63. Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations (2018). arXiv:1810.08907

  64. Su, W., Boyd, S., Candes, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)

  65. Taylor, A.B., Bach, F.: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In: Proceedings of the Conference on Learning Theory, pp. 2934–2992 (2019)

  66. Taylor, A.B., Hendrickx, J.M., Glineur, F.: Exact worst-case performance of first-order methods for composite convex optimization. SIAM J. Optim. 27(3), 1283–1313 (2017). https://doi.org/10.1137/16m108104x

  67. Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1), 307–345 (2017). https://doi.org/10.1007/s10107-016-1009-3

Acknowledgements

The author sincerely appreciates the useful comments by the associate editor and anonymous referees. The author would also like to thank Dr. Felix Lieder who, after the acceptance of this paper, brought to the author’s attention his Ph.D. thesis [42] and his paper [41], which similarly used the PEP approach to optimize the step coefficients of the Krasnosel’skii–Mann iteration for a nonexpansive operator \({\varvec{T}}\). The form of the resulting optimized method differs from that of the accelerated proximal point method proposed in this paper, but [62] recently showed that the two methods are equivalent in the sense that they generate the same sequence when \({\varvec{T}} = 2{\varvec{J}}_{\lambda {\varvec{M}}} - {\varvec{I}}\).
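
For concreteness, below is a minimal sketch of that Halpern-type (anchored) iteration, assuming the anchor coefficients \(\lambda_{k+1} = 1/(k+2)\) analyzed in [41] and the reflected resolvent \({\varvec{T}} = 2{\varvec{J}}_{\lambda {\varvec{M}}} - {\varvec{I}}\) from the equivalence above. The `resolvent` argument (computing \({\varvec{J}}_{\lambda {\varvec{M}}}\)) is a user-supplied placeholder; this is an illustration, not code from either paper.

```python
def optimized_halpern(resolvent, x0, num_iters):
    """Halpern-type iteration x_{k+1} = lam*x0 + (1 - lam)*T(x_k),
    lam = 1/(k+2), with T = 2J - I built from the resolvent J."""
    x = x0
    for k in range(num_iters):
        Tx = 2.0 * resolvent(x) - x      # reflected resolvent T = 2J - I
        lam = 1.0 / (k + 2)              # anchor weight from [41] (assumed)
        x = lam * x0 + (1.0 - lam) * Tx  # anchored averaging toward x0
    return resolvent(x)                  # final proximal-point iterate
```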

Author information

Corresponding author

Correspondence to Donghwan Kim.

Additional information

This work was supported in part by the National Research Foundation of Korea (NRF) Grant funded by the Korea Government (MSIT) (No. 2019R1A5A1028324), and the POSCO Science Fellowship of POSCO TJ Park Foundation.


Cite this article

Kim, D. Accelerated proximal point method for maximally monotone operators. Math. Program. 190, 57–87 (2021). https://doi.org/10.1007/s10107-021-01643-0

