Abstract
The Kurdyka–Łojasiewicz (KL) exponent plays an important role in estimating the convergence rate of many contemporary first-order methods. In particular, a KL exponent of \(\frac{1}{2}\) for a suitable potential function is related to local linear convergence. Nevertheless, the KL exponent is in general extremely hard to estimate. In this paper, we show under mild assumptions that the KL exponent is preserved via inf-projection, a fundamental operation that is ubiquitous when reformulating optimization problems via the lift-and-project approach. By studying how inf-projection acts on the KL exponent, we show that the exponent is \(\frac{1}{2}\) for several important convex optimization models, including some semidefinite-programming-representable functions and some functions involving \(C^2\)-cone reducible structures, under conditions such as strict complementarity. Our results apply to concrete optimization models such as group-fused Lasso and overlapping group Lasso. In addition, for nonconvex models, we show that the KL exponent of many difference-of-convex functions can be derived from that of their natural majorant functions, and that the KL exponent of the Bregman envelope of a function equals that of the function itself. Finally, we estimate the KL exponent of the sum of the least squares function and the indicator function of the set of matrices of rank at most \(k\).
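For context, the KL property with exponent \(\alpha\in[0,1)\) (the notion made precise in Definition 2.1 of the paper) is commonly stated along the following lines; the constants and neighborhoods below are illustrative of the standard formulation rather than quoted from the paper.

```latex
% KL inequality with exponent \alpha at \bar x for a proper closed function f:
% there exist c > 0, \epsilon > 0 and \nu > 0 such that
\[
  \operatorname{dist}\bigl(0,\partial f(x)\bigr)
  \;\ge\; c\,\bigl(f(x)-f(\bar{x})\bigr)^{\alpha}
\]
\[
  \text{whenever } \|x-\bar{x}\|\le \epsilon
  \text{ and } f(\bar{x}) < f(x) < f(\bar{x})+\nu .
\]
% With \alpha = 1/2, this inequality is the ingredient that typically yields
% local linear convergence of standard first-order methods applied to f.
```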
Notes
See Definition 2.1 for the precise definition.
We refer the readers to Sect. 2 for relevant definitions.
Here, f is a proper closed function, thanks to Lemma 2.1(i).
A gauge is a nonnegative positively homogeneous convex function that vanishes at the origin.
See [28, Proposition 2.1(iii)].
Notice that F is proper and closed thanks to the existence of the Slater point \((x^s,u^s,t^s)\).
Note that \(F_1\) is proper and closed thanks to the existence of the Slater point \((x^s,u^s,t^s)\).
Here and henceforth, \(U({\bar{x}},{\bar{u}},f({\bar{x}}))\) is a short-hand notation for the matrix vector product \(U\begin{bmatrix} {\bar{x}}\\ {\bar{u}}\\ f({\bar{x}}) \end{bmatrix}\).
In the case when \(\ker \mathcal {{\bar{A}}}=\{0\}\) so that the basis is empty (i.e., \(r = 0\)), we define \(\mathcal{H}\) to be the unique linear map that maps \(\mathcal{S}^d\) onto the zero vector space.
Recall that \(p\ge 0\). When \(p=0\), we interpret \({\bar{z}}\) as a null vector so that \(U({\bar{x}},{\bar{u}},f({\bar{x}})) = f({\bar{x}})\).
In the case when \(\ker \bar{\mathcal{A}} = \{0\}\) (i.e., \(r = 0\)), we have \({\tilde{Y}}+{\Vert {\hat{A}}_0\Vert _F^{-2}}{\hat{A}}_0 = 0\). In this case, we interpret \(\omega \) as a null vector.
We note that, thanks to Slater's condition, the function F in (3.7) is proper and closed.
When \(r=0\), we set \({\bar{Z}} = 0\in \mathcal{S}^{m+n}\).
When \(r = 0\), this set is \(\{(0,0)\}\) and \({\bar{Z}} = 0\).
The quoted result is for \(C^1\)-cone reducibility. However, it is apparent from the proof how to adapt the result for \(C^2\)-cone reducibility.
References
M. Ahn, J. S. Pang and J. Xin, Difference-of-convex learning: directional stationarity, optimality, and sparsity, SIAM J. Optim. 27 (2017), 1637–1665.
C. M. Alaíz, Á. Barbero and J. R. Dorronsoro, Group fused lasso, in Artificial Neural Networks and Machine Learning–ICANN 2013. Lecture Notes in Computer Science, vol 8131. (V. Mladenov, P. Koprinkova-Hristova, G. Palm, A. E. P. Villa, B. Appollini, N. Kasabov, eds) Springer, Berlin, Heidelberg, (2013), pp. 66–73.
F. J. Aragón Artacho and M. H. Geoffroy, Characterization of metric regularity of subdifferentials, J. Convex Anal. 15 (2008), 365–380.
H. Attouch and J. Bolte, On the convergence of the proximal algorithm for nonsmooth functions involving analytic features, Math. Program. 116 (2009), 5–16.
H. Attouch, J. Bolte, P. Redont and A. Soubeyran, Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality, Math. Oper. Res. 35 (2010), 438–457.
H. Attouch, J. Bolte and B. F. Svaiter, Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods, Math. Program. 137 (2013), 91–129.
A. Auslender and M. Teboulle, Asymptotic Cones and Functions in Optimization and Variational Inequalities, Springer, 2003.
H. H. Bauschke and J. M. Borwein, On projection algorithms for solving convex feasibility problems, SIAM Review 38 (1996), 367–426.
H. H. Bauschke, J. M. Borwein and W. Li, Strong conical hull intersection property, bounded linear regularity, Jameson’s property (G), and error bounds in convex optimization, Math. Program. 86 (1999), 135–160.
H. H. Bauschke, P. L. Combettes and D. Noll, Joint minimization with alternating Bregman proximity operators, Pac. J. Optim. 2 (2006), 401–424.
A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, MPS-SIAM Series on Optimization, 2001.
J. Bolte, A. Daniilidis and A. Lewis, The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems, SIAM J. Optim. 17 (2007), 1205–1223.
J. Bolte, A. Daniilidis, A. Lewis and M. Shiota, Clarke subgradients of stratifiable functions, SIAM J. Optim. 18 (2007), 556–572.
J. Bolte, T. P. Nguyen, J. Peypouquet and B. W. Suter, From error bounds to the complexity of first-order descent methods for convex functions, Math. Program. 165 (2017), 471–507.
J. Bolte, S. Sabach and M. Teboulle, Proximal alternating linearized minimization for nonconvex and nonsmooth problems, Math. Program. 146 (2014), 459–494.
J. Borwein and A. Lewis, Convex Analysis and Nonlinear Optimization, 2nd edition, Springer, 2006.
S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trend. in Mach. Learn. 3 (2010), 1–122.
Y. Cui, C. Ding and X. Zhao, Quadratic growth conditions for convex matrix optimization problems associated with spectral functions, SIAM J. Optim. 27 (2017), 2332–2355.
Y. Cui, D. F. Sun and K. C. Toh, On the asymptotic superlinear convergence of the augmented Lagrangian method for semidefinite programming with multiple solutions, Preprint (2016). Available at https://arxiv.org/abs/1610.00875.
D. D’Acunto and K. Kurdyka, Explicit bounds for the Łojasiewicz exponent in the gradient inequality for polynomials, Ann. Polon. Math. 87 (2005), 51–61.
A. L. Dontchev and R. T. Rockafellar, Implicit Functions and Solution Mappings, Springer, New York, 2009.
D. Drusvyatskiy, A. D. Ioffe and A. S. Lewis, Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria, Math. Program. 185 (2021), 357–383.
D. Drusvyatskiy and A. Lewis, Error bounds, quadratic growth, and linear convergence of proximal methods, Math. Oper. Res. 43 (2018), 919–948.
D. Drusvyatskiy, G. Li and H. Wolkowicz, A note on alternating projections for ill-posed semidefinite feasibility problems, Math. Program. 162 (2017), 537–548.
J. Fan, Comments on “wavelets in statistics: a review” by A. Antoniadis, J. Ital. Stat. Soc. 6 (1997), 131–138.
F. Facchinei and J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, Springer, New York, 2003.
P. Frankel, G. Garrigos and J. Peypouquet, Splitting methods with variable metric for Kurdyka-Łojasiewicz functions and general convergence rates, J. Optim. Theory Appl. 165 (2015), 874–900.
M. P. Friedlander, I. Macêdo and T. K. Pong, Gauge optimization and duality, SIAM J. Optim. 24 (2014), 1999–2022.
J. W. Helton and J. Nie, Semidefinite representation of convex sets, Math. Program. 122 (2010), 21–64.
R. Jiang and D. Li, Novel reformulations and efficient algorithms for the generalized trust region subproblem, SIAM J. Optim. 29 (2019), 1603–1633.
K. Kurdyka, On gradients of functions definable in o-minimal structures, Ann. Inst. Fourier 48 (1998) 769–783.
G. Li, B. S. Mordukhovich and T. S. Pham, New fractional error bounds for polynomial systems with applications to Hölderian stability in optimization and spectral theory of tensors, Math. Program. 153 (2015), 333–362.
G. Li and T. K. Pong, Douglas-Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems, Math. Program. 159 (2016), 371–401.
G. Li and T. K. Pong, Calculus of the exponent of Kurdyka-Łojasiewicz inequality and its applications to linear convergence of first-order methods, Found. Comput. Math. 18 (2018), 1199–1232.
T. Liu and T. K. Pong, Further properties of the forward-backward envelope with applications to difference-of-convex programming, Comput. Optim. Appl. 67 (2017), 489–520.
T. Liu, T. K. Pong and A. Takeda, A refined convergence analysis of pDCA\(_e\) with applications to simultaneous sparse recovery and outlier detection, Comput. Optim. Appl. 73 (2019), 69–100.
H. Liu, W. Wu and A. M.-C. So, Quadratic optimization with orthogonality constraints: explicit Łojasiewicz exponent and linear convergence of line-search methods, in ICML, 2016, pp. 1158–1167.
S. Łojasiewicz, Une propriété topologique des sous-ensembles analytiques réels, in Les Équations aux Dérivées Partielles, Éditions du Centre National de la Recherche Scientifique, Paris, 1963, pp. 87–89.
B. F. Lourenço, M. Muramatsu and T. Tsuchiya, Facial reduction and partial polyhedrality, SIAM J. Optim. 28 (2018), 2304–2326.
Z. Q. Luo, J. S. Pang and D. Ralph, Mathematical Programs with Equilibrium Constraints, Cambridge University Press, Cambridge, 1996.
Z. Q. Luo and P. Tseng, Error bounds and convergence analysis of feasible descent methods: a general approach, Ann. Oper. Res. 46 (1993), 157–178.
N. Parikh and S. P. Boyd, Proximal algorithms, Found. Trends Optimiz. 1 (2013), 123–231.
G. Pataki, The geometry of semidefinite programming, in Handbook of semidefinite programming (H. Wolkowicz, R. Saigal, and L. Vandenberghe, eds.), Kluwer Acad. Publ., Boston, MA, (2000), pp. 29–65.
B. Recht, M. Fazel and P. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Review 52 (2010), 471–501.
R. T. Rockafellar, Convex Analysis, Princeton University Press, 1970.
R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer, 1998.
A. Shapiro, Sensitivity analysis of generalized equations, J. Math. Sci. 115 (2003), 2554–2565.
A. Shapiro and K. Scheinberg, Duality and optimality conditions, in Handbook of semidefinite programming, Kluwer Acad. Publ., Boston, MA, 2000, pp. 67–110.
L. Stella, A. Themelis and P. Patrinos, Forward-backward quasi-Newton methods for nonsmooth optimization problems, Comput. Optim. Appl. 67 (2017), 443–487.
J. F. Sturm, Error bounds for linear matrix inequalities, SIAM J. Optim. 10 (2000), 1228–1248.
H. Tuy, Convex Analysis and Global Optimization, 2nd edition, Springer, 2016.
L. Tunçel and H. Wolkowicz, Strong duality and minimal representations for cone optimization, Comput. Optim. Appl. 53 (2012), 619–648.
P. Tseng and S. Yun, A coordinate gradient descent method for nonsmooth separable minimization, Math. Program. 117 (2009), 387–423.
M. Udell, C. Horn, R. Zadeh and S. Boyd, Generalized low rank models, Found. Trends in Mach. Learn. 9 (2016), 1–118.
A. Watson, Characterization of the subdifferential of some matrix norms, Linear Algebra Appl. 170 (1992), 33–45.
B. Wen, X. Chen and T. K. Pong, A proximal difference-of-convex algorithm with extrapolation, Comput. Optim. Appl. 69 (2018), 297–324.
P. Yin, Y. Lou, Q. He and J. Xin, Minimization of \(\ell _{1-2}\) for compressed sensing, SIAM J. Sci. Comput. 37 (2015), A536–A563.
M. Yue, Z. Zhou and A. M.-C. So, A family of inexact SQA methods for non-smooth convex minimization with provable convergence guarantees based on the Luo-Tseng error bound property, Math. Program. 174 (2019), 327–358.
C.-H. Zhang, Nearly unbiased variable selection under minimax concave penalty, Ann. Stat. 38 (2010), 894–942.
Z. Zhou and A. M.-C. So, A unified approach to error bounds for structured convex optimization problems, Math. Program. 165 (2017), 689–728.
Additional information
Communicated by James Renegar.
Guoyin Li: This author was supported in part by a Future Fellowship (FT130100038) and a Discovery Project grant (DP190100555) from the Australian Research Council.
Ting Kei Pong: This author was supported in part by the Hong Kong Research Grants Council (PolyU153005/17p).
Cite this article
Yu, P., Li, G. & Pong, T.K. Kurdyka–Łojasiewicz Exponent via Inf-projection. Found Comput Math 22, 1171–1217 (2022). https://doi.org/10.1007/s10208-021-09528-6
Keywords
- First-order methods
- Convergence rate
- Kurdyka–Łojasiewicz inequality
- Kurdyka–Łojasiewicz exponent
- Inf-projection