
Fast gradient methods for uniformly convex and weakly smooth problems


Abstract

In this paper, acceleration of gradient methods for convex optimization problems with weak levels of convexity and smoothness is considered. Starting from the universal fast gradient method, which was designed as an optimal method for weakly smooth problems whose gradients are Hölder continuous, its momentum is modified so that the method also accommodates uniformly convex and weakly smooth problems. Unlike existing works, the fast gradient methods proposed in this paper do not rely on restarting; instead, they use momentum terms designed to reflect both the uniform convexity and the weak smoothness of the target energy function. Theoretical and numerical results supporting the superiority of the proposed methods are presented.


Availability of data and materials

All data generated or analyzed during this study are included in this published article.

Code availability

The source code used to generate all data for the current study is available from the corresponding author upon request.

References

  1. Bauschke, H.H., Combettes, P.L.: Convex analysis and monotone operator theory in Hilbert spaces. Springer, New York (2011)

  2. Nemirovskii, A.S., Nesterov, Y.E.: Optimal methods of smooth convex minimization. USSR Comput. Math. Math. Phys. 25(2), 21–30 (1985)


  3. Park, J.: Additive Schwarz methods for convex optimization as gradient methods. SIAM J. Numer. Anal. 58(3), 1495–1530 (2020)


  4. Roulet, V., d’Aspremont, A.: Sharpness, restart, and acceleration. SIAM J. Optim. 30(1), 262–289 (2020)


  5. Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17(4), 1205–1223 (2007)


  6. Xu, Y., Yin, W.: A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM J. Imaging Sci. 6(3), 1758–1789 (2013)


  7. Nesterov, Y.: Universal gradient methods for convex optimization problems. Math. Program. 152(1-2), 381–404 (2015)


  8. Chambolle, A., Pock, T.: An introduction to continuous optimization for imaging. Acta Numer. 25, 161–319 (2016)


  9. Nesterov, Y.: Lectures on convex optimization. Springer, Cham (2018)


  10. Azé, D., Penot, J.-P.: Uniformly convex and uniformly smooth convex functions. In: Annales de la Faculté des sciences de Toulouse: Mathématiques, vol. 4, pp. 705–730 (1995)

  11. Xu, Z.-B., Roach, G.F.: Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces. J. Math. Anal. Appl. 157(1), 189–210 (1991)


  12. Ciarlet, P.G.: The finite element method for elliptic problems. SIAM, Philadelphia (2002)


  13. Bermejo, R., Infante, J.-A.: A multigrid algorithm for the p-Laplacian. SIAM J. Sci. Comput. 21(5), 1774–1789 (2000)


  14. Feng, W., Salgado, A.J., Wang, C., Wise, S.M.: Preconditioned steepest descent methods for some nonlinear elliptic equations involving p-Laplacian terms. J. Comput. Phys. 334, 45–67 (2017)


  15. Huang, Y.Q., Li, R., Liu, W.: Preconditioned descent algorithms for p-Laplacian. J. Sci. Comput. 32(2), 343–371 (2007)


  16. Zhou, G., Feng, C.: The steepest descent algorithm without line search for p-Laplacian. Appl. Math. Comput. 224, 36–45 (2013)


  17. Nesterov, Y.E.: A method for solving the convex programming problem with convergence rate O(1/k²). Dokl. Akad. Nauk SSSR 269, 543–547 (1983)


  18. Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103(1), 127–152 (2005)


  19. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)


  20. Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)


  21. Devolder, O., Glineur, F., Nesterov, Y.: First-order methods with inexact oracle: the strongly convex case. Technical report, CORE Discussion Paper (2013)

  22. Devolder, O., Glineur, F., Nesterov, Y.: First-order methods of smooth convex optimization with inexact oracle. Math. Program. 146(1), 37–75 (2014)


  23. Iouditski, A., Nesterov, Y.: Primal-dual subgradient methods for minimizing uniformly convex functions. arXiv:1401.1792 (2014)

  24. Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1), 451–482 (2014)


  25. Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. Math. Program. 159(1), 81–107 (2016)


  26. Kim, D., Fessler, J.A.: Generalizing the optimized gradient method for smooth convex minimization. SIAM J. Optim. 28(2), 1920–1950 (2018)


  27. Calatroni, L., Chambolle, A.: Backtracking strategies for accelerated descent methods with smooth composite objectives. SIAM J. Optim. 29(3), 1772–1798 (2019)


  28. O'Donoghue, B., Candès, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2015)


  29. Renegar, J., Grimmer, B.: A simple nearly optimal restart scheme for speeding up first-order methods. Found. Comput. Math. https://doi.org/10.1007/s10208-021-09502-2 (2021)

  30. Nesterov, Y., Gasnikov, A., Guminov, S., Dvurechensky, P.: Primal–dual accelerated gradient methods with small-dimensional relaxation oracle. Optim. Methods Softw. 1–38 (2020)

  31. Stonyakin, F., Tyurin, A., Gasnikov, A., Dvurechensky, P., Agafonov, A., Dvinskikh, D., Alkousa, M., Pasechnyuk, D., Artamonov, S., Piskunova, V.: Inexact model: a framework for optimization and variational inequalities. Optim. Methods Softw. 1–47 (2021)

  32. Fercoq, O., Qu, Z.: Adaptive restart of accelerated gradient methods under local quadratic growth condition. IMA J. Numer. Anal. 39(4), 2069–2095 (2019)


  33. Armijo, L.: Minimization of functions having Lipschitz continuous first partial derivatives. Pacific J. Math. 16(1), 1–3 (1966)


  34. Love, E.R.: Some logarithm inequalities. Math. Gaz. 64(427), 55–57 (1980)


  35. Park, J.: Pseudo-linear convergence of an additive Schwarz method for dual total variation minimization. Electron. Trans. Numer. Anal. 54, 176–197 (2021)



Acknowledgements

This work began with the help of Professor Donghwan Kim, through meetings on acceleration schemes for first-order methods. The author would like to thank him for his insightful comments and assistance.

Funding

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2019R1A6A1A10073887).

Author information


Corresponding author

Correspondence to Jongho Park.

Ethics declarations

Competing interests

The author declares no competing interests.

Additional information

Communicated by: Stefan Volkwein


Appendices

Appendix A. Recurrence inequalities

This appendix presents several recurrence inequalities used throughout the paper, together with their proofs. Motivated by [20, Lemma 8] and [7, Theorem 3], we state the following lemma.

Lemma A.1

Let \(\{A_n\}_{n \geq 1}\) be an increasing sequence of positive real numbers that satisfies

$$ (A_{n+1} - A_{n})^{\gamma} \geq C A_{n}A_{n+1}^{\gamma - 1}, \quad n \geq1 $$

for some 1 ≤ γ ≤ 2 and C > 0. Then we have

$$ A_{n} \geq A_{1} \left( 1+ \frac{C^{1/\gamma}}{2} \right)^{\gamma(n-1)}, \quad n \geq 1. $$

Proof

Take any n ≥ 1. Since 1/2 ≤ 1/γ ≤ 1, the following inequality holds:

$$ A_{n+1} - A_{n} \leq (A_{n+1}^{1- 1/\gamma} + A_{n}^{1- 1/\gamma} ) (A_{n+1}^{1/\gamma} - A_{n}^{1/\gamma} ) . $$

Indeed, the right-hand side exceeds the left-hand side by \((A_{n} A_{n+1})^{1 - 1/\gamma} (A_{n+1}^{2/\gamma - 1} - A_{n}^{2/\gamma - 1}) \geq 0\). It follows that

$$ C A_{n} A_{n+1}^{\gamma - 1} \leq (A_{n+1}^{1- 1/\gamma} + A_{n}^{1- 1/\gamma} )^{\gamma} (A_{n+1}^{1/\gamma} - A_{n}^{1/\gamma} )^{\gamma} \leq 2^{\gamma} A_{n+1}^{\gamma - 1} (A_{n+1}^{1/\gamma} - A_{n}^{1/\gamma} )^{\gamma}. $$

Now, we have

$$ A_{n+1}^{1/\gamma} - A_{n}^{1/\gamma} \geq \frac{C^{1/\gamma}}{2} A_{n}^{1/\gamma} \quad \Leftrightarrow \quad A_{n+1} \geq \left( 1 + \frac{C^{1/\gamma}}{2} \right)^{\gamma} A_{n}. $$

We get the desired result by applying the above inequality recursively. □
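As a quick numerical illustration (not part of the original paper), one can generate a sequence that attains equality in the hypothesis of Lemma A.1 and compare it with the stated lower bound. The sketch below is purely illustrative; the helper name and the chosen values of γ and C are arbitrary, and SciPy's brentq is used only to solve the one-step equality for \(A_{n+1}\).

```python
import numpy as np
from scipy.optimize import brentq

def check_lemma_A1(gamma, C, A1=1.0, n_max=50):
    """Generate A_n with equality in (A_{n+1} - A_n)^gamma = C A_n A_{n+1}^(gamma-1)
    and test it against the lower bound A_1 (1 + C^(1/gamma)/2)^(gamma (n-1))."""
    A = [A1]
    for _ in range(n_max - 1):
        An = A[-1]
        # the root lies strictly above A_n; the upper bracket is generous
        f = lambda x: (x - An) ** gamma - C * An * x ** (gamma - 1)
        A.append(brentq(f, An * (1 + 1e-10), An * 1e3))
    A = np.array(A)
    n = np.arange(1, n_max + 1)
    bound = A1 * (1 + C ** (1 / gamma) / 2) ** (gamma * (n - 1))
    return bool(np.all(A >= bound))

print(check_lemma_A1(gamma=1.5, C=0.5))  # expected output: True
```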

The next lemma is useful when we prove sublinear convergence rates of some fast gradient methods. We note that the proof of Lemma A.2 closely follows [15, Lemma 1].

Lemma A.2

Let \(\{A_n\}_{n \geq 1}\) be an increasing sequence of positive real numbers that satisfies

$$ A_{n+1} - A_{n} \geq C A_{n}^{\gamma}, \quad n \geq 1 $$

for some 0 ≤ γ < 1 and C > 0. Then we have

$$ A_{n} \geq \min \left\{ A_{1}, \left( \frac{C}{2 (2^{\frac{1}{1-\gamma}} - 1 )} \right)^{\frac{1}{1-\gamma}} \right\} n^{\frac{1}{1-\gamma}}, \quad n \geq 1. $$

Proof

Take any n ≥ 1. Since \(A_{n+1} \geq A_{n}\), we get

$$ A_{n+1} - A_{n} \geq CA_{n}^{\gamma} \geq C A_{n} A_{n+1}^{-(1-\gamma)}, $$

or equivalently,

$$ \frac{A_{n+1}}{A_{n}} \geq 1 + C A_{n+1}^{-(1-\gamma)}. $$

Writing \(A_{n} = B_{n} n^{\frac {1}{1- \gamma }}\), we have

$$ \frac{B_{n+1}}{B_{n}} \geq \left( \frac{n}{n+1} \right)^{\frac{1}{1-\gamma}} \left( 1 + \frac{C B_{n+1}^{-(1 - \gamma)}}{n+1} \right). $$
(A.1)

The right-hand side of (A.1) is greater than or equal to 1 if and only if

$$ B_{n+1} \leq \left[\frac{n+1}{C} \left( \left( 1+ \frac{1}{n} \right)^{\frac{1}{1-\gamma}} - 1 \right) \right]^{-\frac{1}{1-\gamma}} $$
(A.2)

From the fact that the right-hand side of (A.2) increases as n increases, we deduce that a sufficient condition for \(B_{n+1} \geq B_{n}\) is that

$$ B_{n+1} \leq \left( \frac{C}{2(2^{\frac{1}{1-\gamma}} - 1 )} \right)^{\frac{1}{1-\gamma}}. $$

Then it is straightforward by mathematical induction that

$$ B_{n} \geq \min \left\{ B_{1}, \left( \frac{C}{2(2^{\frac{1}{1-\gamma}} - 1 )} \right)^{\frac{1}{1-\gamma}} \right\}, \quad n \geq 1, $$

which implies the desired result. □
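Similarly, a minimal sketch (not from the paper) that builds a sequence with equality in the hypothesis of Lemma A.2 and compares it with the stated lower bound; the values of γ and C below are arbitrary.

```python
import numpy as np

def check_lemma_A2(gamma, C, A1=1.0, n_max=10000):
    """Generate A_n with equality in A_{n+1} - A_n = C A_n^gamma and test it
    against the bound min{A_1, (C / (2 (2^(1/(1-gamma)) - 1)))^(1/(1-gamma))} n^(1/(1-gamma))."""
    A = np.empty(n_max)
    A[0] = A1
    for n in range(n_max - 1):
        A[n + 1] = A[n] + C * A[n] ** gamma
    n = np.arange(1, n_max + 1)
    const = min(A1, (C / (2 * (2 ** (1 / (1 - gamma)) - 1))) ** (1 / (1 - gamma)))
    return bool(np.all(A >= const * n ** (1 / (1 - gamma))))

print(check_lemma_A2(gamma=0.5, C=0.1))  # expected output: True
```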

Appendix B. Numerical verification of Claim 4.8

This appendix is devoted to a discussion of Claim 4.8. We prove a special case of Claim 4.8 and then present numerical evidence for the remaining cases. First, we consider the case p = 2, i.e., the function f in (1.5) is strongly convex.

Proposition B.1

Claim 4.8 holds when p = 2.

Proof

If p = 2, then (4.1) and (4.8) reduce to (3.6) and (3.7), respectively. Therefore, the desired result can be obtained by the same argument as in Theorem 3.8. □

Proposition B.1 means that Claim 4.8 is indeed a generalization of Theorem 3.8 to the case p ≥ 2. In the remainder of this appendix, we assume that p > 2 ≥ q ≥ 1. We show that the following claim is sufficient to ensure Claim 4.8.

Claim B.2

Let \(\{A_n\}_{n \geq 1}\) be an increasing sequence of positive real numbers that satisfies

$$ (A_{n+1} - A_{n})^{2} \geq C A_{n}^{1 - \frac{2-q}{q} \left( 1 + \frac{2(p-q)}{p(3q-2)} \right)} \sum\limits_{j=1}^{n} \frac{(A_{j} - A_{j-1})^{\frac{2}{p}}}{A_{j}^{\frac{p-2}{p} \frac{2(p-q)}{p(3q-2)}}}, \quad n \geq 1 $$
(B.1)

for some p > 2 ≥ q ≥ 1 and C > 0. Then we have

$$ A_{n} \geq \widetilde{C} n^{\frac{p(3q-2)}{2(p-q)}}, \quad n \geq 1, $$

where \(\widetilde{C}\) is a positive constant depending on p, q, C, and \(A_1\) only.

Remark B.3

Replacing \(A_n\), \(A_{n+1} - A_{n}\), and \(\sum_{j=1}^{n} \cdot\) in (B.1) by \(y(t)\), \(y^{\prime}(t)\), and \(\int_{0}^{t} \cdot \, ds\), respectively, one can obtain the ordinary differential equation

$$ y^{\prime}(t)^{2} = C y(t)^{1 - \frac{2-q}{q} \left( 1 + \frac{2(p-q)}{p(3q-2)} \right)} {{\int}_{0}^{t}} \frac{y^{\prime}(s)^{\frac{2}{p}}}{y(s)^{\frac{p-2}{p} \frac{2(p-q)}{p(3q-2)}}} ds, $$
(B.2)

which is a continuous analogue of (B.1). If we impose the initial condition y(0) = 0 on (B.2), then we can readily verify that (B.2) admits the solution

$$ y(t) = \widehat{C} t^{\frac{p(3q-2)}{2(p-q)}}, \quad t \geq 0, $$

where \(\widehat {C}\) is an appropriate constant depending on p, q, and C. That is, the solution of (B.2) has the same growth rate as the conclusion of Claim B.2.
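For completeness, here is a brief substitution check, not in the source article, that the power law above indeed solves (B.2). Write \(\alpha = \frac{p(3q-2)}{2(p-q)}\), so that \(\frac{2(p-q)}{p(3q-2)} = \frac{1}{\alpha}\). Substituting \(y(t) = \widehat{C} t^{\alpha}\) into the integral in (B.2) gives

$$ \int_{0}^{t} \frac{y^{\prime}(s)^{\frac{2}{p}}}{y(s)^{\frac{p-2}{p} \frac{1}{\alpha}}} \, ds \propto \int_{0}^{t} s^{\frac{2(\alpha - 1)}{p} - \frac{p-2}{p}} \, ds \propto t^{\frac{2\alpha}{p}}, $$

so matching the exponents of t on the two sides of (B.2) requires

$$ 2\alpha - 2 = \alpha \left( 1 - \frac{2-q}{q} \left( 1 + \frac{1}{\alpha} \right) \right) + \frac{2\alpha}{p} \quad \Leftrightarrow \quad \alpha = \frac{p(3q-2)}{2(p-q)}. $$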

Even though Claim B.2 has a rather complex structure, it is in fact a generalization of the recurrence in Lemma A.2. If we set p = 2 and 1 ≤ q < 2 in (B.1), then we get

$$ A_{n+1} - A_{n} \geq C^{\frac{1}{2}} A_{n}^{\frac{4q-4}{3q-2}}, \quad n \geq 1, $$

which has the same form as the recurrence in Lemma A.2 (note that \(0 \leq \frac{4q-4}{3q-2} < 1\) if 1 ≤ q < 2).
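As a sanity check on this specialization (not part of the paper), the exponent algebra can be verified symbolically; the script below is a sketch and its variable names are purely illustrative.

```python
import sympy as sp

q = sp.symbols('q', positive=True)
p = sp.Integer(2)  # the specialization considered above

# exponent of A_n on the right-hand side of (B.1)
expo = 1 - (2 - q) / q * (1 + 2 * (p - q) / (p * (3 * q - 2)))

# for p = 2 the sum in (B.1) collapses to A_n - A_0, which is of the same
# order as A_n, so the total exponent becomes expo + 1; taking square roots
# of both sides of (B.1) then halves it
reduced = sp.simplify((expo + 1) / 2)
assert sp.simplify(reduced - (4 * q - 4) / (3 * q - 2)) == 0
print(reduced)  # an expression equivalent to (4q - 4)/(3q - 2)
```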

Proposition B.4

Assume that p > 2 ≥ q ≥ 1. Claim B.2 implies Claim 4.8.

Proof

The starting point of the proof is (4.1); (4.4) and (4.8) imply that

$$ \begin{array}{ll} (A_{n+1} - A_{n})^{2} &\geq \frac{A_{n+1} {\sum}_{j=1}^{n} \delta_{j-1}^{\frac{p-2}{p}} \mu^{\frac{2}{p}} (A_{j} - A_{j-1}) }{L_{n+1}} \\ &\geq \frac{C_{\delta}^{\frac{p-2}{p}}C_{\epsilon}^{\frac{2-q}{q}}}{2 \kappa} A_{n+1}^{1 - \frac{2-q}{q} \left( 1 + \frac{2(p-q)}{p(3q-2)} \right)} \sum\limits_{j=1}^{n} \frac{(A_{j} - A_{j-1})^{\frac{2}{p}}}{A_{j}^{\frac{p-2}{p} \frac{2(p-q)}{p(3q-2)}}} \\ &\geq \frac{C_{\delta}^{\frac{p-2}{p}}C_{\epsilon}^{\frac{2-q}{q}}}{2 \kappa} A_{n}^{1 - \frac{2-q}{q} \left( 1 + \frac{2(p-q)}{p(3q-2)} \right)} \sum\limits_{j=1}^{n} \frac{(A_{j} - A_{j-1})^{\frac{2}{p}}}{A_{j}^{\frac{p-2}{p} \frac{2(p-q)}{p(3q-2)}}} , \end{array} $$

where κ was defined in (2.3). Similarly to (3.8), one can verify that \(A_1\) has a lower bound depending on p, q, and L only. Hence, if we assume that Claim B.2 holds, then we can conclude that \(A_{n} \geq \widetilde{C} n^{\frac{p(3q-2)}{2(p-q)}}\), where \(\widetilde{C}\) is a positive constant depending on p, q, L, μ, \(C_{\epsilon}\), and \(C_{\delta}\). It is straightforward to prove that

$$ F(x_{n}) - F(x^{*}) \leq \frac{1}{2 \widetilde{C}n^{\frac{p(3q-2)}{2(p-q)}}} \left( \|x_{0} - x^{*} \|^{2} + \frac{C}{\widetilde{C}^{\frac{2(p-q)}{p(3q-2)}}} \left( 1 + \log n \right) \right) $$

by invoking Theorem 4.4 and using the same argument as in the proof of Theorem 3.8. □

We verify Claim B.2 by numerical experiments. Figure 7 plots \(A_n\) and \(n^{\frac{p(3q-2)}{2(p-q)}}\) with respect to n on a log-log scale, where \(A_n\) is generated by the recurrence relation

$$ A_{1} =1, \quad (A_{n+1} - A_{n})^{2} = A_{n}^{1 - \frac{2-q}{q} \left( 1 + \frac{2(p-q)}{p(3q-2)} \right)} \sum\limits_{j=1}^{n} \frac{(A_{j} - A_{j-1})^{\frac{2}{p}}}{A_{j}^{\frac{p-2}{p} \frac{2(p-q)}{p(3q-2)}}}, n \geq 1 $$
(B.3)

for various choices of p and q. In all cases, the slope of the graph of \(A_n\) is slightly greater than that of the graph of \(n^{\frac{p(3q-2)}{2(p-q)}}\). That is, the asymptotic growth rate of \(A_n\) is observed to be at least \(n^{\frac{p(3q-2)}{2(p-q)}}\), which supports Claim B.2. By Proposition B.4, these numerical results in turn support the validity of Claim 4.8.

Fig. 7: Growth of \(A_n\) generated by the recurrence relation (B.3) and \(n^{\frac{p(3q-2)}{2(p-q)}}\) for various p and q such that p > 2 ≥ q ≥ 1
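A minimal sketch, not from the paper, of how such an experiment can be reproduced: it generates \(A_n\) from (B.3) with equality and estimates the log-log slope, which should come out slightly above \(\frac{p(3q-2)}{2(p-q)}\). The value \(A_0 = 0\) used for the first term of the sum is an assumption of this sketch, and the chosen p and q are arbitrary.

```python
import numpy as np

def generate_A(p, q, n_max, A1=1.0, A0=0.0):
    """Generate A_n from the recurrence (B.3) with equality.
    A0 = 0 is an assumption for the j = 1 term of the sum."""
    r = 2 * (p - q) / (p * (3 * q - 2))  # reciprocal of the claimed rate p(3q-2)/(2(p-q))
    A = [A0, A1]
    # running sum S_n = sum_{j<=n} (A_j - A_{j-1})^{2/p} / A_j^{((p-2)/p) r}
    S = (A1 - A0) ** (2 / p) / A1 ** ((p - 2) / p * r)
    for _ in range(n_max - 1):
        An = A[-1]
        A_next = An + np.sqrt(An ** (1 - (2 - q) / q * (1 + r)) * S)
        A.append(A_next)
        S += (A_next - An) ** (2 / p) / A_next ** ((p - 2) / p * r)
    return np.array(A[1:])  # A_1, ..., A_{n_max}

p, q, n_max = 4, 1.5, 10000
A = generate_A(p, q, n_max)
n = np.arange(1, n_max + 1)
rate = p * (3 * q - 2) / (2 * (p - q))
# least-squares slope in log-log coordinates over the tail of the sequence
slope = np.polyfit(np.log(n[100:]), np.log(A[100:]), 1)[0]
print(f"observed slope ~ {slope:.3f}, claimed rate {rate:.3f}")
```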


Cite this article

Park, J. Fast gradient methods for uniformly convex and weakly smooth problems. Adv Comput Math 48, 34 (2022). https://doi.org/10.1007/s10444-022-09943-5

