Abstract
A strategy is proposed for characterizing the worst-case performance of algorithms for solving nonconvex smooth optimization problems. Contemporary analyses characterize worst-case performance by providing, under certain assumptions on an objective function, an upper bound on the number of iterations (or function or derivative evaluations) required until a pth-order stationarity condition is approximately satisfied. This arguably leads to conservative characterizations, since the bounds are driven by pathological worst-case objectives rather than by ones typically encountered in practice. By contrast, the strategy proposed in this paper characterizes worst-case performance separately over regions comprising a search space. These regions are defined generically based on properties of derivative values. In this manner, one can analyze the worst-case performance of an algorithm independently from any particular class of objectives. Then, once given a class of objectives, one can obtain a tailored complexity analysis merely by delineating the types of regions that comprise the search spaces for functions in the class. Regions defined by first- and second-order derivatives are discussed in detail and example complexity analyses are provided for a few standard first- and second-order algorithms when employed to minimize convex and nonconvex objectives of interest. It is also explained how the strategy can be generalized to regions defined by higher-order derivatives and for analyzing the behavior of higher-order algorithms.
Notes
For the sake of brevity, we focus on worst-case complexity in terms of upper bounds on the number of iterations required until a termination condition is satisfied, although in general one should also take function and derivative evaluation complexity into account. These can be considered in the same manner as iteration complexity in our proposed strategy.
Some authors take the term gradient-dominated to mean gradient-dominated of degree 2. We do not take this meaning since, as seen in [32] and in this paper, functions that are only gradient-dominated of degree 1 offer different and interesting results.
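For concreteness, gradient dominance of degree \(p\) is commonly stated as follows. This is a sketch of the standard definition (as in Polyak [44]); the constant \(c\) and the lower bound \(f_{\inf}\) are generic placeholders, and the exact constants used in the paper may differ:

```latex
% Gradient dominance of degree p, with p in [1,2]:
% there exists c > 0 such that, for all x in the domain,
f(x) - f_{\inf} \;\le\; c\,\|\nabla f(x)\|^{p}.
% Degree p = 2 recovers the Polyak--Lojasiewicz condition,
% while p = 1 is the weaker variant referenced in this note.
```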
In this case, the decrease in the objective would be indicative of an m-step linear (for part (a)) or m-step sublinear (for part (b)) rate of convergence. We do not explicitly refer to such a multi-step aspect of a convergence rate since it is always clear from the context.
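As a sketch of what an m-step rate means, the following hypothetical gap sequences (not taken from the paper; the constants `m`, `rho`, and `C` are illustrative placeholders) contrast an m-step linear contraction, where \(f_{k+m} - f_{ref} \le \rho\,(f_k - f_{ref})\) for some \(\rho < 1\), with an m-step sublinear decay:

```python
# Illustrative sketch (not from the paper): gaps {f_k - f_ref} for two
# hypothetical runs, one with an m-step linear rate and one with an
# m-step sublinear rate. All constants are placeholders.
m, rho, C = 3, 0.5, 10.0

def gap_linear(k, e0=1.0):
    """m-step linear: the gap contracts by the factor rho every m iterations."""
    return e0 * rho ** (k // m)

def gap_sublinear(k):
    """m-step sublinear: the gap decays like C divided by the block count k // m."""
    return C / max(k // m, 1)

# Verify the m-step linear contraction over consecutive blocks of m iterations.
for k in range(0, 30, m):
    assert gap_linear(k + m) <= rho * gap_linear(k)

print(gap_linear(30), gap_sublinear(30))  # 0.0009765625 1.0
```

After 30 iterations (10 blocks of m = 3), the linearly contracting gap has shrunk by a factor of \(2^{10}\), while the sublinear gap has only shrunk by a factor of 10, which is the distinction the note glosses over by dropping the "m-step" qualifier.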
An interesting scenario arises in this theorem when \(x_{k+1} \in \mathcal{R}_{1}^{1}\), during which \(\{f_k - f_{\mathrm{ref}}\}\) might initially decrease at a superlinear rate. However, this should not be overstated: if this scenario occurs at all, then the number of iterations in which it occurs will be limited if the iterates remain at or near points in \(\mathcal{R}_{1}\).
References
Birgin, E.G., Gardenghi, J.L., Martínez, J.M., Santos, S.A., Toint, Ph.L.: Worst-case evaluation complexity for unconstrained nonlinear optimization using high-order regularized models. Math. Program. 163(1), 359–368 (2017)
Birgin, E.G., Martínez, J.M.: The use of quadratic regularization with a cubic descent condition for unconstrained optimization. SIAM J. Optim. 27(2), 1049–1074 (2017)
Borgwardt, K.-H.: The average number of pivot steps required by the Simplex-Method is polynomial. Zeitschrift für Operations Research 26(1), 157–177 (1982)
Carmon, Y., Hinder, O., Duchi, J.C., Sidford, A.: “Convex until proven guilty”: dimension-free acceleration of gradient descent on non-convex functions. In: Proceedings of the International Conference on Machine Learning, PMLR Vol. 70, pp. 654–663 (2017)
Cartis, C., Gould, N.I.M., Toint, Ph.L.: On the complexity of steepest descent, Newton’s and regularized Newton’s methods for nonconvex unconstrained optimization problems. SIAM J. Optim. 20(6), 2833–2852 (2010)
Cartis, C., Gould, N.I.M., Toint, Ph.L.: Adaptive cubic regularisation methods for unconstrained optimization. Part I: Motivation, convergence and numerical results. Math. Program. 127, 245–295 (2011)
Cartis, C., Gould, N.I.M., Toint, Ph.L.: Adaptive cubic regularisation methods for unconstrained optimization. Part II: Worst-case function- and derivative-evaluation complexity. Math. Program. 130(2), 295–319 (2011)
Cartis, C., Gould, N.I.M., Toint, Ph.L.: Optimal Newton-type methods for nonconvex smooth optimization problems. Technical Report ERGO Technical Report 11-009, School of Mathematics, University of Edinburgh (2011)
Cartis, C., Gould, N.I.M., Toint, Ph.L.: Evaluation complexity bounds for smooth constrained nonlinear optimisation using scaled KKT conditions, high-order models and the criticality measure \(\chi \). CoRR. arXiv:1705.04895 (2017)
Cartis, C., Gould, N.I.M., Toint, Ph.L.: Worst-case evaluation complexity of regularization methods for smooth unconstrained optimization using Hölder continuous gradients. Optim. Methods Softw. 32(6), 1273–1298 (2017)
Cartis, C., Scheinberg, K.: Global convergence rate analysis of unconstrained optimization methods based on probabilistic models. Math. Program. 169(2), 337–375 (2018)
Conn, A.R., Gould, N.I.M., Toint, Ph.L.: Trust-Region Methods. Society for Industrial and Applied Mathematics (SIAM) (2000)
Curtis, F.E., Lubberts, Z., Robinson, D.P.: Concise complexity analyses for trust region methods. Optim. Lett. 12(8), 1713–1724 (2018)
Curtis, F.E., Robinson, D.P.: Exploiting negative curvature in deterministic and stochastic optimization. Math. Program. Ser. B 176(1), 69–94 (2019)
Curtis, F.E., Robinson, D.P., Samadi, M.: A trust region algorithm with a worst-case iteration complexity of \({\cal{O}}(\epsilon ^{-3/2})\) for nonconvex optimization. Math. Program. 162(1), 1–32 (2017)
Curtis, F.E., Robinson, D.P., Samadi, M.: An inexact regularized Newton framework with a worst-case iteration complexity of \({\cal{O}}(\epsilon ^{-3/2})\) for nonconvex optimization. IMA J. Numer. Anal. (2018). https://doi.org/10.1093/imanum/dry022
Dauphin, Y.N., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., Bengio, Y.: Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In: Proceedings of the International Conference on Neural Information Processing Systems, pp. 2933–2941 (2014)
Dussault, J.-P.: ARCq: a new adaptive regularization by cubics. Optim. Methods Softw. 33(2), 322–335 (2018)
Dussault, J.-P., Orban, D.: Scalable adaptive cubic regularization methods. Technical Report G-2015-109, GERAD (2017)
Fan, J., Yuan, Y.: A new trust region algorithm with trust region radius converging to zero. In: Proceedings of the International Conference on Optimization: Techniques and Applications, ICOTA, pp. 786–794 (2001)
Ge, R., Huang, F., Jin, C., Yuan, Y.: Escaping from saddle points: online stochastic gradient for tensor decomposition. In: Proceedings of the Conference on Learning Theory, COLT, pp. 797–842 (2015)
Gould, N.I.M., Porcelli, M., Toint, Ph.L.: Updating the regularization parameter in the adaptive cubic regularization algorithm. Comput. Optim. Appl. 53(1), 1–22 (2012)
Grapiglia, G.N., Yuan, J., Yuan, Y.: On the convergence and worst-case complexity of trust-region and regularization methods for unconstrained optimization. Math. Program. 152(1–2), 491–520 (2015)
Grapiglia, G.N., Yuan, J., Yuan, Y.: Nonlinear stepsize control algorithms: complexity bounds for first- and second-order optimality. J. Optim. Theory Appl. 171(3), 980–997 (2016)
Gratton, S., Royer, C.W., Vicente, L.N.: A decoupled first/second-order steps technique for nonconvex nonlinear unconstrained optimization with improved complexity bounds. Math. Program. (2018). https://doi.org/10.1007/s10107-018-1328-7
Jin, C., Ge, R., Netrapalli, P., Kakade, S.M., Jordan, M.I.: How to escape saddle points efficiently. In: Proceedings of the International Conference on Machine Learning, ICML, pp. 1724–1732 (2017)
Karimi, H., Nutini, J., Schmidt, M.: Linear convergence of gradient and proximal-gradient methods under the Polyak-Łojasiewicz condition. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 795–811 (2016)
Lee, J.D., Panageas, I., Piliouras, G., Simchowitz, M., Jordan, M.I., Recht, B.: First-order methods almost always avoid strict saddle points. Math. Program. 176(1), 311–337 (2019)
Lee, J.D., Simchowitz, M., Jordan, M.I., Recht, B.: Gradient descent only converges to minimizers. In: Proceedings of the Conference on Learning Theory, COLT, pp. 1246–1257 (2016)
Liu, M., Li, Z., Wang, X., Yi, J., Yang, T.: Adaptive negative curvature descent with applications in non-convex optimization. In: Proceedings of the International Conference on Neural Information Processing Systems, NeurIPS, pp. 4854–4863 (2018)
Nesterov, Yu.: Introductory Lectures on Convex Optimization. Springer, New York (2004)
Nesterov, Yu., Polyak, B.T.: Cubic regularization of Newton’s method and its global performance. Math. Program. 108(1), 117–205 (2006)
Nocedal, J., Wright, S.J.: Numerical Optimization, Second edn. Springer, New York (2006)
Paternain, S., Mokhtari, A., Ribeiro, A.: A Newton-based method for nonconvex optimization with fast evasion of saddle points. SIAM J. Optim. 29(1), 343–368 (2019)
Polyak, B.T.: Gradient methods for minimization of functionals. USSR Comput. Math. Math. Phys. 3(3), 643–653 (1963)
Royer, C., Wright, S.J.: Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. SIAM J. Optim. 28(2), 1448–1477 (2018)
Smale, S.: On the average number of steps of the simplex method of linear programming. Math. Program. 27(3), 241–262 (1983)
Spielman, D.A., Teng, S.-H.: Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. J. Assoc. Comput. Mach. 51(3), 385–463 (2004)
Toint, Ph.L.: Nonlinear stepsize control, trust regions and regularizations for unconstrained optimization. Optim. Methods Softw. 28(1), 82–95 (2013)
Additional information
Supported by the U.S. Department of Energy, Office of Science, Early Career Research Program under Award Number DE–SC0010615 (Advanced Scientific Computing Research), and by the U.S. National Science Foundation under Award Numbers CCF-1740796 and CCF-1618717 (Division of Computing and Communication Foundations) and IIS-1704458 (Division of Information and Intelligent Systems).
Cite this article
Curtis, F.E., Robinson, D.P. Regional complexity analysis of algorithms for nonconvex smooth optimization. Math. Program. 187, 579–615 (2021). https://doi.org/10.1007/s10107-020-01492-3
Keywords
- Nonlinear optimization
- Nonconvex optimization
- Worst-case iteration complexity
- Worst-case evaluation complexity
- Regional complexity analysis