The generalized trust region subproblem: solution complexity and convex hull results

Wang, Alex L.; Kılınç-Karzan, Fatma

doi:10.1007/s10107-020-01560-8

Full Length Paper
Series A
Published: 10 October 2020

Volume 191, pages 445–486, (2022)
Mathematical Programming

Abstract

We consider the generalized trust region subproblem (GTRS) of minimizing a nonconvex quadratic objective over a nonconvex quadratic constraint. A lifting of this problem recasts the GTRS as minimizing a linear objective subject to two nonconvex quadratic constraints. Our first main contribution is structural: we give an explicit description of the convex hull of this nonconvex set in terms of the generalized eigenvalues of an associated matrix pencil. This result may be of interest in building relaxations for nonconvex quadratic programs. Moreover, this result allows us to reformulate the GTRS as the minimization of the maximum of two convex quadratic functions in the original space. Our next set of contributions is algorithmic: we present an algorithm for solving the GTRS up to an $\epsilon$ additive error based on this reformulation. We carefully handle numerical issues that arise from inexact generalized eigenvalue and eigenvector computations and establish explicit running time guarantees for these algorithms. Notably, our algorithms run in linear (in the size of the input) time. Furthermore, our algorithm for computing an $\epsilon$-optimal solution has a slightly improved running time dependence on $\epsilon$ over the state-of-the-art algorithm. Our analysis shows that the dominant cost in solving the GTRS lies in solving a generalized eigenvalue problem, which establishes a natural connection between these problems. Finally, generalizations of our convex hull results allow us to apply our algorithms and their theoretical guarantees directly to equality-, interval-, and hollow-constrained variants of the GTRS. This gives the first linear-time algorithm in the literature for these variants of the GTRS.

Notes

In fact, this assumption can be made without loss of generality; see Remark 1.
Our definition of accuracy is presented in (9).
Recall that a convex quadratic function $x^\top A x+ 2b^\top x + c$ is 2L-smooth if and only if $A\preceq LI$.

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, PA, 15213, USA
Alex L. Wang & Fatma Kılınç-Karzan

Corresponding author

Correspondence to Fatma Kılınç-Karzan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research is supported in part by National Science Foundation Grant CMMI 1454548.

Appendices

Proofs of Theorems 3 and 4

In this appendix, we outline how to modify the proofs of Theorems 1 and 2 to prove Theorems 3 and 4.

Theorem 3 Suppose there exists $\gamma ^*\ge 0$ such that $A(\gamma ^*)\succ 0$. Consider the closed nonempty interval $\varGamma {:}{=}\left\{ \gamma \in {{\mathbb {R}}}_+:\, A(\gamma )\succeq 0\right\} $. Let $\gamma _-$ denote its leftmost endpoint.

If $\varGamma $ is bounded above, let $\gamma _+$ denote its rightmost endpoint. Then,
$$\begin{aligned} {{\,\mathrm{conv}\,}}(\mathcal{S}) = \mathcal{S}(\gamma _-)\cap \mathcal{S}(\gamma _+). \end{aligned}$$
In particular, we have $\min _{x\in {{\mathbb {R}}}^n}\left\{ q_0(x):\, q_1(x)\le 0\right\} = \min _{x\in {{\mathbb {R}}}^n} \max \left\{ q(\gamma _-, x),\, q(\gamma _+,x)\right\} $.
If $\varGamma $ is not bounded above, then $q_1(x)$ is convex and
$$\begin{aligned} {{\,\mathrm{conv}\,}}(\mathcal{S}) = \mathcal{S}(\gamma _-)\cap \left\{ (x,t)\in {{\mathbb {R}}}^{n+1}:\, q_1(x)\le 0\right\} . \end{aligned}$$
In particular, we have $\min _{x\in {{\mathbb {R}}}^n}\left\{ q_0(x):\, q_1(x)\le 0\right\} = \min _{x\in {{\mathbb {R}}}^n} \left\{ q(\gamma _-,x):\, q_1(x)\le 0\right\} $.
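The endpoints $\gamma _-$ and $\gamma _+$ in Theorem 3 can be approximated numerically on small dense instances. The following is a minimal sketch, assuming the pencil has the form $A(\gamma ) = A_0 + \gamma A_1$ and that a point $\gamma ^*$ with $A(\gamma ^*)\succ 0$ is known. The function names are hypothetical, and the exact eigenvalue calls stand in for the inexact eigenvalue routines analyzed in the paper.

```python
import numpy as np

def lambda_min(A0, A1, gamma):
    """Smallest eigenvalue of A(gamma) = A0 + gamma * A1 (dense, exact)."""
    return float(np.linalg.eigvalsh(A0 + gamma * A1)[0])

def bisect_boundary(A0, A1, inside, outside, tol=1e-10):
    """Bisect between a point with A(gamma) PSD and a point where it is not.

    Valid because Gamma is an interval, so PSD-ness changes exactly once
    between `inside` and `outside`.
    """
    while abs(outside - inside) > tol:
        mid = 0.5 * (inside + outside)
        if lambda_min(A0, A1, mid) >= 0.0:
            inside = mid
        else:
            outside = mid
    return inside

def gamma_interval(A0, A1, gamma_star, max_doublings=60):
    """Return (gamma_minus, gamma_plus); gamma_plus is None if Gamma is unbounded."""
    if gamma_star < 0 or lambda_min(A0, A1, gamma_star) <= 0.0:
        raise ValueError("gamma_star must be nonnegative with A(gamma_star) > 0")
    # Left endpoint: 0 if A0 is already PSD, otherwise bisect on [0, gamma_star].
    if lambda_min(A0, A1, 0.0) >= 0.0:
        gamma_minus = 0.0
    else:
        gamma_minus = bisect_boundary(A0, A1, gamma_star, 0.0)
    # Gamma is unbounded above exactly when A1 is PSD (q_1 convex).
    if np.linalg.eigvalsh(A1)[0] >= 0.0:
        return gamma_minus, None
    # Otherwise double until A(gamma) stops being PSD, then bisect.
    outside = max(1.0, 2.0 * gamma_star)
    for _ in range(max_doublings):
        if lambda_min(A0, A1, outside) < 0.0:
            break
        outside *= 2.0
    else:
        raise RuntimeError("no gamma with A(gamma) not PSD was found")
    return gamma_minus, bisect_boundary(A0, A1, gamma_star, outside)
```

For example, with $A_0 = \mathrm{diag}(1,-1)$, $A_1 = \mathrm{diag}(-1,2)$ and $\gamma ^* = 0.75$, the pencil is $\mathrm{diag}(1-\gamma ,\,2\gamma -1)$, so the sketch should recover $\varGamma = [1/2, 1]$ up to the bisection tolerance.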

Proof

The “$\subseteq $” inclusions follow from a trivial modification of Lemma 2. It suffices to prove the “$\supseteq $” inclusions. The case where $A_0$ and $A_1$ are both nonconvex is covered by Theorem 1. We consider the three remaining cases:

Suppose $A_0$ and $A_1$ are both convex. In this case, $\varGamma = [0,\infty )$ and it suffices to show that ${{\,\mathrm{conv}\,}}(\mathcal{S}) = \left\{ (x,t):\, q_0(x)\le t,\, q_1(x) \le 0\right\} = \mathcal{S}$. This holds as $\mathcal{S}$ is convex.
Suppose $A_0$ is nonconvex and $A_1$ is convex. In this case, $\varGamma = [\gamma _-, \infty )$ is unbounded above. Furthermore, $\gamma _-$ is positive and $A(\gamma _-)$ has a zero eigenvalue. Suppose $({\hat{x}},{\hat{t}})$ satisfies $q(\gamma _-, {\hat{x}})\le {\hat{t}}$ and $q_1({\hat{x}})\le 0$. If $q_1({\hat{x}}) = 0$, then we also have $q_0({\hat{x}}) = q(\gamma _-,{\hat{x}}) \le {\hat{t}}$, whence $({\hat{x}},{\hat{t}})\in \mathcal{S}$. On the other hand, if $q_1({\hat{x}}) < 0$, we may apply the argument in case (iii) in the proof of Lemma 3 verbatim (after replacing all occurrences of $\gamma _+$ by $\gamma ^*$) to conclude that $({\hat{x}},{\hat{t}})\in {{\,\mathrm{conv}\,}}(\mathcal{S})$.
Suppose $A_0$ is convex and $A_1$ is nonconvex. In this case, $\varGamma = [0,\gamma _+]$ is bounded above and $\gamma _-=0$. Furthermore, $A(\gamma _+)$ has a zero eigenvalue. Suppose $({\hat{x}},{\hat{t}})\in \mathcal{S}(\gamma _-)\cap \mathcal{S}(\gamma _+)$. If $q_1({\hat{x}}) \le 0$, then we also have $q_0({\hat{x}}) = q(\gamma _-,{\hat{x}}) \le {\hat{t}}$, whence $({\hat{x}},{\hat{t}})\in \mathcal{S}$. On the other hand, if $q_1({\hat{x}}) > 0$, we may apply the argument in case (ii) in the proof of Lemma 3 verbatim to conclude that $({\hat{x}},{\hat{t}})\in {{\,\mathrm{conv}\,}}(\mathcal{S})$.

$\square $
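As a quick sanity check of the min-max reformulation in Theorem 3, one can compare both sides on a one-dimensional instance by brute force. The sketch below uses an illustrative instance that is not taken from the paper: $q_0(x) = x^2 - x$ and $q_1(x) = 1 - x^2$, for which $A(\gamma ) = 1-\gamma $, $\varGamma = [0,1]$, $\gamma _- = 0$ and $\gamma _+ = 1$. It assumes $q(\gamma ,x) = q_0(x) + \gamma q_1(x)$; the grid and the function name `compare_values` are hypothetical choices.

```python
import numpy as np

def q0(x):
    return x**2 - x          # convex objective (A_0 = 1)

def q1(x):
    return 1.0 - x**2        # nonconvex constraint (A_1 = -1), feasible iff |x| >= 1

def q(gamma, x):
    return q0(x) + gamma * q1(x)

def compare_values(num_points=6001):
    """Grid approximations of min{q0 : q1 <= 0} and min max{q(g-,x), q(g+,x)}."""
    xs = np.linspace(-3.0, 3.0, num_points)
    gamma_minus, gamma_plus = 0.0, 1.0
    feasible = q1(xs) <= 0.0
    gtrs_value = float(q0(xs[feasible]).min())
    minmax_value = float(np.maximum(q(gamma_minus, xs), q(gamma_plus, xs)).min())
    return gtrs_value, minmax_value
```

On this instance both optimal values equal $0$ (attained at $x = 1$), so the two grid estimates should agree up to the grid resolution.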

We will prove Theorem 4 using a limiting argument and reducing it to Theorem 3. The proof follows that of Lemma 5 almost verbatim.

Theorem 4 Suppose there exists $\gamma ^*\ge 0$ such that $A(\gamma ^*)\succeq 0$. Consider the closed nonempty interval $\varGamma {:}{=}\left\{ \gamma \in {{\mathbb {R}}}_+:\, A(\gamma )\succeq 0\right\} $. Let $\gamma _-$ denote its leftmost endpoint.

If $\varGamma $ is bounded above, let $\gamma _+$ denote its rightmost endpoint. Then,
$$\begin{aligned} \overline{{{\,\mathrm{conv}\,}}}(\mathcal{S}) = \mathcal{S}(\gamma _-)\cap \mathcal{S}(\gamma _+). \end{aligned}$$
In particular, $\inf _{x\in {{\mathbb {R}}}^n}\left\{ q_0(x):\, q_1(x)\le 0\right\} = \inf _{x\in {{\mathbb {R}}}^n} \max \left\{ q(\gamma _-, x),\, q(\gamma _+,x)\right\} $.
If $\varGamma $ is not bounded above, then $q_1(x)$ is convex and
$$\begin{aligned} \overline{{{\,\mathrm{conv}\,}}}(\mathcal{S}) = \mathcal{S}(\gamma _-)\cap \left\{ (x,t)\in {{\mathbb {R}}}^{n+1}:\, q_1(x)\le 0\right\} . \end{aligned}$$
In particular, $\inf _{x\in {{\mathbb {R}}}^n}\left\{ q_0(x):\, q_1(x)\le 0\right\} = \inf _{x\in {{\mathbb {R}}}^n} \left\{ q(\gamma _-,x):\, q_1(x)\le 0\right\} $.

Proof

The “$\subseteq $” inclusions follow from a trivial modification of Lemma 4. It suffices to prove the “$\supseteq $” inclusions.

Denote the set on the right hand side by $\mathcal{R}$, i.e., $\mathcal{R}{:}{=}\mathcal{S}(\gamma _-) \cap \mathcal{S}(\gamma _+)$ when $\varGamma $ is bounded and $\mathcal{R}{:}{=}\mathcal{S}(\gamma _-)\cap \left\{ (x,t):\, q_1(x)\le 0\right\} $ when $\varGamma $ is unbounded.

Let $({\hat{x}},{\hat{t}})\in \mathcal{R}$. It suffices to show that $({\hat{x}},\hat{t}+\epsilon )\in {{\,\mathrm{conv}\,}}(\mathcal{S})$ for all $\epsilon >0$.

We will perturb $A_0$ slightly to create a new instance of the problem. Let $\delta >0$ be a parameter to be chosen later. Define $A_0' = A_0 +\delta I_n$ and let all remaining data be unchanged, i.e.,

$$\begin{aligned} q_0'(x)&{:}{=}x^\top A_0' x + 2b_0'^\top x + c_0' {:}{=}x^\top (A_0+\delta I_n) x + 2b_0^\top x + c_0\\ q_1'(x)&{:}{=}x^\top A_1' x + 2b_1'^\top x + c_1' {:}{=}x^\top A_1 x + 2b_1^\top x + c_1. \end{aligned}$$

We will denote all quantities related to the perturbed system with an apostrophe.

We claim it suffices to show that there exists $\delta >0$ small enough such that $({\hat{x}},{\hat{t}}+\epsilon )\in \mathcal{R}'$. Indeed, suppose this is the case. Note that for any $x\in {{\mathbb {R}}}^n$, we have $q_1(x) = q_1'(x)$ and $q_0(x) \le q_0'(x)$. Hence, ${{\,\mathrm{conv}\,}}(\mathcal{S}')\subseteq {{\,\mathrm{conv}\,}}(\mathcal{S})$. Then, noting that $A'(\gamma ^*) = A(\gamma ^*) + \delta I_n\succ 0$, we may apply Theorem 3 to the perturbed system to get $({\hat{x}},{\hat{t}}+\epsilon )\in \mathcal{R}' = {{\,\mathrm{conv}\,}}(\mathcal{S}')\subseteq {{\,\mathrm{conv}\,}}(\mathcal{S})$ as desired.

First note that $A_1 = A_1'$ so that $\varGamma $ is bounded if and only if $\varGamma '$ is bounded. We will then pick $\delta >0$ small enough such that

$$\begin{aligned} \delta \left\Vert {\hat{x}} \right\Vert ^2 \le \frac{\epsilon }{2},\qquad \left|\gamma _-' -\gamma _- \right|\left|q_1({\hat{x}}) \right|\le \frac{\epsilon }{2},\qquad \left|\gamma _+' -\gamma _+ \right|\left|q_1({\hat{x}}) \right|\le \frac{\epsilon }{2}, \end{aligned}$$

where the last condition is only required when $\gamma _+$ and $\gamma _+'$ both exist. This is possible as the expression on the left of each inequality is continuous in $\delta $ and is strictly satisfied when $\delta =0$.

The following computation shows that $q'(\gamma _-', {\hat{x}}) \le {\hat{t}} + \epsilon $.

$$\begin{aligned} q'(\gamma _-', {\hat{x}}) -({\hat{t}}+ \epsilon )&= q'(\gamma _-, {\hat{x}}) -({\hat{t}}+ \epsilon ) + (\gamma _-' - \gamma _-) q_1({\hat{x}})\\&\le q(\gamma _-, {\hat{x}}) + \delta \left\Vert {\hat{x}} \right\Vert ^2 -({\hat{t}}+ \epsilon ) + \left|\gamma _-' - \gamma _- \right|\left|q_1({\hat{x}}) \right|\\&\le q(\gamma _-,{\hat{x}}) - {\hat{t}}\\&\le 0 \end{aligned}$$

The first inequality follows by noting $q'(\gamma ,x) = q(\gamma ,x) + \delta \left\Vert x \right\Vert ^2$, the second inequality follows from our assumptions on $\delta $, and the third inequality follows from the assumption that $({\hat{x}}, {\hat{t}})\in \mathcal{S}(\gamma _-)$. Thus $(\hat{x},{\hat{t}}+\epsilon )\in \mathcal{S}'(\gamma _-')$. When $\varGamma $ is bounded (or equivalently, when $\gamma _+'$ and $\gamma _+$ exist), a similar calculation shows that $q'(\gamma _+',{\hat{x}}) - ({\hat{t}} + \epsilon )\le 0$ so that $({\hat{x}},\hat{t}+\epsilon )\in \mathcal{S}'(\gamma '_+)$. Finally, when $\varGamma $ is unbounded we have $q_1'({\hat{x}}) = q_1({\hat{x}})\le 0$ so that $({\hat{x}},\hat{t}+\epsilon )\in \left\{ (x,t):\, q_1'(x)\le 0\right\} $. Thus, $({\hat{x}},\hat{t}+\epsilon )$ is in $\mathcal{R}'$, concluding the proof. $\square $
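For completeness, the “similar calculation” for the right endpoint can be written out as follows. This is a sketch that mirrors the computation for $\gamma _-'$ line by line, assuming $\varGamma $ is bounded so that $({\hat{x}},{\hat{t}})\in \mathcal{S}(\gamma _+)$ and the third condition on $\delta $ is in force.

$$\begin{aligned} q'(\gamma _+', {\hat{x}}) -({\hat{t}}+ \epsilon )&= q'(\gamma _+, {\hat{x}}) -({\hat{t}}+ \epsilon ) + (\gamma _+' - \gamma _+) q_1({\hat{x}})\\&\le q(\gamma _+, {\hat{x}}) + \delta \left\Vert {\hat{x}} \right\Vert ^2 -({\hat{t}}+ \epsilon ) + \left|\gamma _+' - \gamma _+ \right|\left|q_1({\hat{x}}) \right|\\&\le q(\gamma _+,{\hat{x}}) - {\hat{t}}\\&\le 0. \end{aligned}$$

Here the first inequality uses $q'(\gamma ,x) = q(\gamma ,x) + \delta \left\Vert x \right\Vert ^2$, the second uses the first and third bounds on $\delta $, and the last uses $({\hat{x}},{\hat{t}})\in \mathcal{S}(\gamma _+)$.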

Estimation of the regularity parameters

In Sect. 4 we gave algorithms to solve the GTRS assuming that we had access to $(\xi ,\zeta )$ and ${\hat{\gamma }}$ satisfying Assumption 5. In this appendix, we show how to compute these quantities.

Let $q_0,q_1$ satisfy Assumption 4. Recall the definitions

$$\begin{aligned} \xi ^*&{:}{=}\min \left\{ 1,\max _{\gamma \ge 0} \lambda _{\min }(A(\gamma ))\right\} ,\qquad \zeta ^* {:}{=}\max \left\{ 1,\gamma _+\right\} . \end{aligned}$$

We will find $(\xi ,\zeta )$ satisfying

$$\begin{aligned} \xi ^*/4\le \xi \le \xi ^*,\qquad \zeta ^* \le \zeta \le 4\zeta ^* \end{aligned}$$

and a ${\hat{\gamma }}$ such that $\lambda _{\min }(A({\hat{\gamma }}))\ge \xi $.

We will accomplish this in two stages. We begin by estimating $\xi ^*$ using only an upper bound ${\bar{\zeta }}$ of $\zeta ^*$. Then using our estimate $\xi $ we will compute $\zeta $.

1.1 Computing $\xi $ and ${\hat{\gamma }}$

We start with the following guarantee for the algorithm TestXi (Algorithm 5).

Lemma 17

Given $q_0,q_1$ satisfying Assumption 4, an arbitrary $0<\xi \le 1$, an upper bound ${\bar{\zeta }}\ge \zeta ^*$, and a failure probability $p_{\xi }>0$, TestXi (Algorithm 5) will output

$$\begin{aligned} {\left\{ \begin{array}{ll} {\hat{\gamma }} \text { such that } \lambda _{\min }(A({\hat{\gamma }}))\ge \xi /2 &{} \text {if } \xi \le \xi ^*\\ {\hat{\gamma }} \text { such that } \lambda _{\min }(A({\hat{\gamma }}))\ge \xi /2 \text { or ``Fail''} &{} \text {if } \xi ^*<\xi \le 2\xi ^*\\ \text {``Fail''} &{} \text {if }2\xi ^*<\xi \end{array}\right. } \end{aligned}$$

with probability $1-p_{\xi }$. This algorithm runs in time

$$\begin{aligned} {\tilde{O}}\left( N\sqrt{\frac{{\bar{\zeta }}}{\xi }} \log \left( \frac{n}{p_\xi }\right) \log \left( \frac{{\bar{\zeta }}}{\xi }\right) \right) . \end{aligned}$$

Proof

We condition on the event that ApproxEig succeeds every time it is called. By the union bound, this happens with probability at least $1-p_\xi $.

As we have conditioned on ApproxEig succeeding, any ${\hat{\gamma }}$ which is output by TestXi will satisfy

$$\begin{aligned} \lambda _{\min }(A({\hat{\gamma }}))\ge 3\xi /4 - \xi /4 =\xi /2. \end{aligned}$$

It is clear that TestXi will output “Fail” if $\xi >2\xi ^*$: in this case $\xi ^*<\xi /2\le 1/2$, so that $\xi ^* = \max _{\gamma \ge 0}\lambda _{\min }(A(\gamma ))$ and there does not exist any ${\hat{\gamma }}$ such that $\lambda _{\min }(A({\hat{\gamma }}))\ge \xi /2$. It remains to show that, given $\xi \le \xi ^*$, TestXi will output some ${\hat{\gamma }}$.

For the sake of contradiction, assume that the algorithm fails to output in each of the T rounds. Let $P {:}{=}\left\{ \gamma :\, \lambda _{\min }(A(\gamma ))\ge 3\xi ^*/4\right\} $. Recall that $\lambda _{\min }(A(\gamma ))$ is 1-Lipschitz in $\gamma $. As there exists some $\gamma $ such that $\lambda _{\min }(A(\gamma ))\ge \xi ^*$ (see Definition 1), we conclude that P is an interval of length at least $\xi ^*/2$.

Note that $P\subseteq [s_0,t_0]$. We will inductively show that $P\subseteq [s_k,t_k]$ for each $k\in \left\{ 1,\dots ,T\right\} $. Let $k\in \left\{ 0,\dots ,T-1\right\} $ and let $s_k,{\bar{\gamma }},t_k$ be defined as in the algorithm and let x be the unit vector found in step 3.(d). We claim that $x^\top A_1x\ne 0$. Indeed suppose $x^\top A_1x = 0$, then $x^\top A(\gamma )x = x^\top A({\bar{\gamma }})x \le 3\xi /4$ for all $\gamma $. This contradicts the assumption that there exists some $\gamma $ such that $\lambda _{\min }(A(\gamma ))\ge \xi $. Now suppose $\gamma \in P$, then

$$\begin{aligned} \frac{3\xi ^*}{4}&\le x^\top A(\gamma )x = x^\top A({\bar{\gamma }})x + (\gamma -{\bar{\gamma }})x^\top A_1 x \le \frac{3\xi ^*}{4} +(\gamma -{\bar{\gamma }})x^\top A_1x, \end{aligned}$$

where the first inequality follows from $\gamma \in P$, and the last one from the fact that the algorithm did not output in iteration k (and thus the if statement in step 3.(d) did not hold). Thus, if $x^\top A_1x > 0$, then we have the implication $\gamma \in P \implies \gamma \ge {\bar{\gamma }}$. Similarly, if $x^\top A_1x<0$, then we have the implication $\gamma \in P\implies \gamma \le {\bar{\gamma }}$. Then by induction, we have $P\subseteq [s_{k+1},t_{k+1}]$.

We conclude that P, an interval of length at least $\xi ^*/2$, is contained in $[s_T,t_T]$, an interval of length

$$\begin{aligned} t_T - s_T = \frac{t_0 - s_0}{2^T} \le \xi /4. \end{aligned}$$

Noting that $\xi \le \xi ^*$ gives us the desired contradiction.

The running time of this algorithm follows from Lemma 11. $\square $

Given a lower bound $\xi \le \xi ^*$, Lemma 17 guarantees that TestXi will find a ${\hat{\gamma }}$ satisfying $\lambda _{\min }(A({\hat{\gamma }}))\ge \xi /2$ with high probability. In order to make use of this lemma without a lower bound on $\xi ^*$, we will simply repeatedly call TestXi with decreasing guesses for $\xi $. Consider Algorithm 6.

Theorem 8

Given $q_0,q_1$ satisfying Assumption 4, an upper bound ${\bar{\zeta }}\ge \zeta ^*$, and a failure probability $p>0$, ApproxXi (Algorithm 6) will output $\xi $ and ${\hat{\gamma }}$ such that

$$\begin{aligned} \xi ^*/4\le \xi \le \xi ^*,\qquad \lambda _{\min }(A({\hat{\gamma }}))\ge \xi \end{aligned}$$

and run in time

$$\begin{aligned} {\tilde{O}}\left( N\sqrt{\frac{{\bar{\zeta }}}{\xi ^*}}\log \left( \frac{n}{p}\right) \log \left( {\bar{\zeta }}\right) \log \left( \frac{1}{\xi ^*}\right) ^3\right) \end{aligned}$$

with probability $1-p$.

Proof

We condition on the event that TestXi succeeds every time it is called. By the union bound, this happens with probability at least $1-p$.

Let $k^*\in \left\{ 1,2,\dots \right\} $ be such that $\xi ^*/2\le 2^{-k^*}<\xi ^*$. Then, as we have conditioned on TestXi succeeding, Lemma 17 (applied with $\xi = 2^{-(k-1)}$) guarantees that TestXi$(q_0, q_1, 2^{-(k-1)},{\bar{\zeta }}, 2^{-k}p)$ outputs

$$\begin{aligned} {\left\{ \begin{array}{ll} {\hat{\gamma }} \text { such that } \lambda _{\min }(A({\hat{\gamma }}))\ge 2^{-k} &{} \text {if } 2^{-k}\le \xi ^*/2\\ {\hat{\gamma }} \text { such that } \lambda _{\min }(A({\hat{\gamma }}))\ge 2^{-k} \text { or ``Fail''} &{} \text {if } \xi ^*/2<2^{-k}\le \xi ^*\\ \text {``Fail''} &{} \text {if }\xi ^*<2^{-k}. \end{array}\right. } \end{aligned}$$

Thus, TestXi will output “Fail” for every $k<k^*$ and will output ${\hat{\gamma }}$ either on round $k^*$ or $k^*+1$. We can then bound

$$\begin{aligned} \lambda _{\min }(A({\hat{\gamma }})) \ge 2^{-(k^* +1)}\ge \frac{\xi ^*}{4}. \end{aligned}$$

We bound the run time of the algorithm as follows.

$$\begin{aligned}&\sum _{k=1}^{k^* +1} {\tilde{O}}\left( N\sqrt{\frac{{\bar{\zeta }}}{2^{-(k-1)}}} \log \left( \frac{n}{2^{-k}p}\right) \log \left( \frac{{\bar{\zeta }}}{2^{-(k-1)}}\right) \right) \\&\quad = {\tilde{O}}\left( k^{*3} N\sqrt{\frac{{\bar{\zeta }}}{2^{-k^*}}} \log \left( \frac{n}{p}\right) \log \left( {\bar{\zeta }}\right) \right) \\&\quad = {\tilde{O}}\left( N\sqrt{\frac{{\bar{\zeta }}}{\xi ^*}} \log \left( \frac{n}{p}\right) \log \left( {\bar{\zeta }}\right) \log \left( \frac{1}{\xi ^*}\right) ^3\right) . \end{aligned}$$

$\square $
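The outer loop analyzed in this proof can be sketched as follows. Algorithm 6 is not reproduced in this appendix, so this is a reconstruction from the proof under stated assumptions: round $k$ queries a TestXi-style oracle with the guess $2^{-(k-1)}$, and on success the reported estimate is $\xi = 2^{-k}$, which is the lower bound Lemma 17 certifies for $\lambda _{\min }(A({\hat{\gamma }}))$. The names `approx_xi_sketch` and `test_xi`, and the cap `max_rounds`, are hypothetical.

```python
def approx_xi_sketch(test_xi, max_rounds=64):
    """Halve the guess for xi until the oracle returns some gamma_hat.

    `test_xi(xi_guess)` is assumed to behave like TestXi in Lemma 17:
    it returns gamma_hat with lambda_min(A(gamma_hat)) >= xi_guess / 2,
    or None to signal "Fail".
    """
    for k in range(1, max_rounds + 1):
        xi_guess = 2.0 ** (-(k - 1))
        gamma_hat = test_xi(xi_guess)
        if gamma_hat is not None:
            # Certified lower bound: lambda_min(A(gamma_hat)) >= 2^{-k}.
            return xi_guess / 2.0, gamma_hat
    raise RuntimeError("oracle failed in every round")
```

For instance, if $\xi ^* = 1/6$ and the oracle succeeds exactly when $\xi /2\le \xi ^*$, the loop fails for the guesses $1$ and $1/2$, succeeds for $1/4$, and reports $\xi = 1/8$, which lies in $[\xi ^*/4,\xi ^*]$.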

1.2 Computing $\zeta $

Recall the guarantee of the algorithm ApproxGammaPlus.

Lemma 12 Given $q_0$, $q_1$ satisfying Assumption 4, $(\xi ,\zeta )$ and $\hat{\gamma }$ satisfying Assumption 5, $\delta >0$, and a failure probability $p_{\tilde{\gamma }_+}>0$, ApproxGammaPlus (Algorithm 2) outputs $\tilde{\gamma }_+$ satisfying

$$\begin{aligned} \tilde{\gamma }_+\in [\gamma _+-\delta ,\;\gamma _+],\qquad \lambda _{\min }(A(\tilde{\gamma }_+))\le \delta /\kappa \end{aligned}$$

with probability $1-p_{\tilde{\gamma }_+}$. This algorithm runs in time

$$\begin{aligned} \tilde{O}\left( \frac{N\sqrt{\kappa \zeta }}{\sqrt{\delta }}\log \left( \frac{n}{p_{\tilde{\gamma }_+}}\right) \log \left( \frac{\kappa }{\delta }\right) \right) . \end{aligned}$$

We will repeatedly call ApproxGammaPlus with different choices of $\delta $. Consider the algorithm ApproxZeta.

Theorem 9

Given $q_0,q_1$ satisfying Assumption 4, $(\xi ,{\bar{\zeta }})$ and ${\hat{\gamma }}$ satisfying Assumption 5, and failure probability $p>0$, ApproxZeta (Algorithm 7) will output $\zeta $ such that

$$\begin{aligned} \zeta ^*\le \zeta \le 4\zeta ^* \end{aligned}$$

and run in time

$$\begin{aligned} {\tilde{O}}\left( \frac{N\sqrt{\zeta ^*}}{\sqrt{\xi }}\log \left( \frac{n}{p}\right) \log \left( \frac{1}{\xi }\right) \log \left( \frac{{\bar{\zeta }}}{\zeta ^*}\right) ^2\right) \end{aligned}$$

with probability $1-p$.

Proof

We condition on the event that ApproxGammaPlus succeeds every time it is called. By the union bound, this happens with probability at least $1-p$.

We first check that the assumptions of Lemma 12 hold. For $k = 1$, we have $2^{-(k-1)}{\bar{\zeta }} = {\bar{\zeta }}\ge \zeta ^*$. Then by induction, and conditioning on ApproxGammaPlus succeeding, Lemma 12 guarantees

$$\begin{aligned} \zeta ^* \le {\hat{\zeta }}_{k} + 2^{-(k+1)}{\bar{\zeta }}. \end{aligned}$$

If ApproxZeta fails to terminate in round k, then 1.(b) ensures ${\hat{\zeta }}_{k}\le 2^{-(k+1)}{\bar{\zeta }}$. This in turn implies that $\zeta ^* \le 2^{-((k+1)-1)}{\bar{\zeta }}$ and, by induction, the assumptions of Lemma 12 hold in every round that ApproxGammaPlus is called.

Let k be the round in which the algorithm terminates. If $k = 1$, then the guarantee of Lemma 12 implies $\zeta ^*\ge {\hat{\zeta }}_{1}$, whence

$$\begin{aligned} {\bar{\zeta }}\ge \zeta ^*\ge {\hat{\zeta }_{1}} > \frac{1}{4}{\bar{\zeta }}. \end{aligned}$$

Thus, we may assume $k\ge 2$. The condition of step 1.(b) then guarantees the two inequalities

$$\begin{aligned} {\hat{\zeta }}_{k-1} \le 2^{-k}{\bar{\zeta }}, \,\text { and }\, {\hat{\zeta }}_{k} > 2^{-(k+1)}{\bar{\zeta }}. \end{aligned} \qquad (14)$$

Then, we have

$$\begin{aligned} \zeta ^*&\ge {\hat{\zeta }_{k}} > 2^{-(k+1)}{\bar{\zeta }} = \frac{1}{4}\left( 2^{-k}{\bar{\zeta }} + 2^{-k}{\bar{\zeta }}\right) \ge \frac{1}{4} \left( {\hat{\zeta }_{k-1}} +2^{-((k-1)+1)}{\bar{\zeta }}\right) \ge \zeta ^*/4 \end{aligned}$$

where the first and fifth relations follow from Lemma 12 and the second and fourth relations follow from (14) above.

It remains to bound the run time of ApproxZeta. Let $k^*\in \left\{ 1,2,\dots \right\} $ be such that $\zeta ^* \le 2^{-(k^*-1)}{\bar{\zeta }} < 2\zeta ^*$. We show that ApproxZeta terminates within $k^*$ rounds. Suppose ApproxZeta reaches the $k^*$th round. Then, we have

$$\begin{aligned} {\hat{\zeta }_{k^*}}&\ge \zeta ^* - 2^{-(k^*+1)}{\bar{\zeta }} > 2^{-k^*}{\bar{\zeta }} - 2^{-(k^*+1)}{\bar{\zeta }} = 2^{-(k^* +1)}{\bar{\zeta }}, \end{aligned}$$

where we used Lemma 12 in the first relation, and the definition of $k^*$ in the second relation. Therefore, ApproxZeta terminates in round $k^*$ at the latest and we can bound the run time of this algorithm as

$$\begin{aligned}&\sum _{k=1}^{k^*} {\tilde{O}}\left( \frac{2^{-(k-1)}N{\bar{\zeta }}}{\sqrt{2^{-(k+1)}\xi {\bar{\zeta }}}}\log \left( \frac{n}{2^{-k}p}\right) \log \left( \frac{2^{-(k-1)}{\bar{\zeta }}}{2^{-(k+1)}\xi {\bar{\zeta }}}\right) \right) \\&\quad ={\tilde{O}}\left( k^{*2}\frac{N\sqrt{2^{-k^*}{\bar{\zeta }}}}{\sqrt{\xi }}\log \left( \frac{n}{p}\right) \log \left( \frac{1}{\xi }\right) \right) \\&\quad =\tilde{O}\left( \frac{N\sqrt{\zeta ^*}}{\sqrt{\xi }}\log \left( \frac{n}{p}\right) \log \left( \frac{1}{\xi }\right) \log \left( \frac{{\bar{\zeta }}}{\zeta ^*}\right) ^2\right) . \end{aligned}$$

$\square $
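The halving scheme analyzed in this proof can be sketched in the same spirit. Algorithm 7 is not reproduced in this appendix, so the following is a reconstruction from the proof, with these assumptions: round $k$ calls an ApproxGammaPlus-style oracle with the upper bound $2^{-(k-1)}{\bar{\zeta }}$ and accuracy $\delta = 2^{-(k+1)}{\bar{\zeta }}$, the estimate is ${\hat{\zeta }}_k = \max \{1,\tilde{\gamma }_+\}$, the loop stops once ${\hat{\zeta }}_k > 2^{-(k+1)}{\bar{\zeta }}$, and the reported value is $\zeta = 2^{-(k-1)}{\bar{\zeta }}$. All names below are hypothetical.

```python
def approx_zeta_sketch(approx_gamma_plus, zeta_bar, max_rounds=64):
    """Shrink the upper bound zeta_bar on zeta* until it is within a factor 4.

    `approx_gamma_plus(zeta_guess, delta)` is assumed to return gamma_tilde
    in [gamma_plus - delta, gamma_plus], as in Lemma 12.
    """
    for k in range(1, max_rounds + 1):
        zeta_guess = 2.0 ** (-(k - 1)) * zeta_bar   # current valid upper bound
        delta = 2.0 ** (-(k + 1)) * zeta_bar        # accuracy requested this round
        gamma_tilde = approx_gamma_plus(zeta_guess, delta)
        zeta_hat = max(1.0, gamma_tilde)            # estimate of zeta* = max{1, gamma_+}
        if zeta_hat > delta:
            # Then zeta* >= zeta_hat > zeta_guess / 4, so zeta_guess <= 4 * zeta*.
            return zeta_guess
    raise RuntimeError("did not terminate within max_rounds")
```

Since ${\hat{\zeta }}_k\ge 1$ in every round, the loop stops at the latest once $2^{-(k+1)}{\bar{\zeta }}<1$. For a toy oracle with $\gamma _+ = 3$ and ${\bar{\zeta }} = 100$ that returns $\gamma _+ - \delta /2$, the sketch stops in round 5 with $\zeta = 6.25\in [\zeta ^*,4\zeta ^*]$.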

About this article

Cite this article

Wang, A.L., Kılınç-Karzan, F. The generalized trust region subproblem: solution complexity and convex hull results. Math. Program. 191, 445–486 (2022). https://doi.org/10.1007/s10107-020-01560-8

Received: 18 July 2019
Accepted: 01 September 2020
Published: 10 October 2020
Issue Date: February 2022
DOI: https://doi.org/10.1007/s10107-020-01560-8
