
Using nonlinear functions to approximate a new quasi-Newton method for unconstrained optimization problems

  • Original Paper
  • Published in: Numerical Algorithms

Abstract

To approximate the Hessian matrix of the objective function with higher-order accuracy, we use the chain rule and propose two modified secant equations. An interesting property of the proposed methods is that they use information from the two most recent steps, whereas the usual secant equation uses only the latest step. A further point of interest is that one of the proposed methods makes use of both gradient and function-value information. We show that the modified BFGS methods based on the new secant equations are globally convergent. The reported experimental results illustrate that the proposed methods are efficient.


Figs. 1–3


References

  1. Broyden, C. G.: A class of methods for solving nonlinear simultaneous equations. Math. Comp. 19, 577–593 (1965)
  2. Broyden, C. G.: The convergence of a class of double-rank minimization algorithms: 2. The new algorithm. IMA J. Appl. Math. 6(3), 222–231 (1970)
  3. Byrd, R., Nocedal, J.: A tool for the analysis of quasi-Newton methods with application to unconstrained minimization. SIAM J. Numer. Anal. 26, 727–739 (1989)
  4. Byrd, R., Nocedal, J., Yuan, Y.: Global convergence of a class of quasi-Newton methods on convex problems. SIAM J. Numer. Anal. 24, 1171–1189 (1987)
  5. Dai, Y.: Convergence properties of the BFGS algorithm. SIAM J. Optim. 13, 693–701 (2003)
  6. Dolan, E., Moré, J. J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
  7. Ford, J. A., Moghrabi, I. A.: Multi-step quasi-Newton methods for optimization. J. Comput. Appl. Math. 50, 305–323 (1994)
  8. Ford, J. A., Moghrabi, I. A.: Alternating multi-step quasi-Newton methods for unconstrained optimization. J. Comput. Appl. Math. 82, 105–116 (1997)
  9. Ford, J. A., Saadallah, A. F.: On the construction of minimisation methods of quasi-Newton type. J. Comput. Appl. Math. 20, 239–246 (1987)
  10. Gould, N. I. M., Orban, D., Toint, Ph. L.: CUTEr, a constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 29, 373–394 (2003)
  11. Griewank, A.: The global convergence of partitioned BFGS on problems with convex decompositions and Lipschitzian gradients. Math. Program. 50, 141–175 (1991)
  12. Li, D. H., Fukushima, M.: On the global convergence of the BFGS method for nonconvex unconstrained optimization problems. SIAM J. Optim. 11, 1054–1064 (2001)
  13. Li, D. H., Fukushima, M.: A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math. 129, 15–35 (2001)
  14. Mascarenhas, W. F.: The BFGS method with exact line searches fails for non-convex objective functions. Math. Program. 99, 49–61 (2004)
  15. Moghrabi, I. A.: Multi-step quasi-Newton methods for unconstrained optimization. Ph.D. thesis, Department of Computer Science, University of Essex (1993)
  16. Moghrabi, I. A.: Exploiting function values in multi-step methods. Int. J. Pure Appl. Math. 28(2), 187–196 (2006)
  17. Moghrabi, I. A., Ford, J. A.: A nonlinear model for function-value multistep methods. Comput. Math. Appl. 42, 1157–1164 (2001)
  18. Nocedal, J., Wright, S. J.: Numerical Optimization. Springer, New York (2006)
  19. Pearson, J. D.: Variable metric methods of minimisation. Comput. J. 12, 171–178 (1969)
  20. Powell, M. J. D.: Some global convergence properties of a variable metric algorithm for minimization without exact line searches. In: Cottle, R. W., Lemke, C. E. (eds.) Nonlinear Programming, SIAM-AMS Proceedings, vol. 9, pp. 53–72. SIAM, Philadelphia (1976)
  21. Toint, Ph. L.: Global convergence of the partitioned BFGS algorithm for convex partially separable optimization. Math. Program. 36, 290–306 (1986)
  22. Wei, Z., Li, G., Qi, L.: New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput. 175, 1156–1188 (2006)
  23. Yuan, G., Wei, Z.: Convergence analysis of a modified BFGS method on convex minimizations. Comput. Optim. Appl. 47, 237–255 (2010)
  24. Zhang, J. Z., Deng, N. Y., Chen, L. H.: New quasi-Newton equation and related methods for unconstrained optimization. J. Optim. Theory Appl. 102, 147–167 (1999)
  25. Zhang, J. Z., Xu, C. X.: Properties and numerical performance of quasi-Newton methods with modified quasi-Newton equation. J. Comput. Appl. Math. 137, 269–278 (2001)


Author information

Correspondence to N. Bidabadi.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Here, in Table 1, the column headings have the following meanings:

n: number of variables of the problem

No.: problem number

Table 1 Test problems taken from CUTEr library

Note 1

Assume that Lj, j = 0, 1, 2, and \( \frac {dX(t)}{dt} \) are defined by (9) and (18), respectively. Then we have

$$ \begin{array}{lll} {\int}_{-1}^{1} L_{2}(t) \frac{dX(t)}{dt}dt& =&{\int}_{-1}^{1} \frac{t(t+1)}{2} \left( \frac{(1+2t)}{2}s_{k}+\frac{(1-2t)}{2}s_{k-1}\right)dt \\ \\ & &={\int}_{-1}^{1} \frac{t(t+1)(1+2t)}{4}s_{k}dt+{\int}_{-1}^{1} \frac{t(t+1)(1-2t)}{4}s_{k-1}dt \\ \\ & &=\frac{1}{2}(s_{k}-\frac{1}{3}s_{k-1}), \end{array} $$
$$ \begin{array}{lll} {\int}_{-1}^{1} L_{1}(t) \frac{dX(t)}{dt}dt& =&{\int}_{-1}^{1} (1-t^{2}) \left( \frac{(1+2t)}{2}s_{k}+\frac{(1-2t)}{2}s_{k-1}\right)dt \\ \\ & &={\int}_{-1}^{1} \frac{(1-t^{2})(1+2t)}{2}s_{k}dt+{\int}_{-1}^{1} \frac{(1-t^{2})(1-2t)}{2}s_{k-1}dt \\ \\ & &=\frac{2}{3}(s_{k}+s_{k-1}), \end{array} $$

and

$$ \begin{array}{lll} {\int}_{-1}^{1} L_{0}(t) \frac{dX(t)}{dt}dt& =&{\int}_{-1}^{1} (\frac{t(t-1)}{2}) \left( \frac{(1+2t)}{2}s_{k}+\frac{(1-2t)}{2}s_{k-1}\right)dt \\ \\ & &={\int}_{-1}^{1} \frac{t(t-1)(1+2t)}{4}s_{k}dt+{\int}_{-1}^{1} \frac{t(t-1)(1-2t)}{4}s_{k-1}dt \\ \\ & &=\frac{1}{2}(-\frac{1}{3}s_{k}+s_{k-1}). \end{array} $$
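The three coefficient computations in Note 1 can be verified numerically. The sketch below is illustrative (not the authors' code): it treats sk and sk-1 as scalar placeholders (the actual quantities are vectors, but the coefficients apply componentwise) and integrates each Lagrange basis function against dX(t)/dt with a composite trapezoid rule.

```python
def quad(f, a=-1.0, b=1.0, n=10_000):
    # composite trapezoid rule on [a, b]
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + i * h) for i in range(1, n)))

def dX(t, sk, skm1):
    # dX(t)/dt: ((1 + 2t)/2) s_k + ((1 - 2t)/2) s_{k-1}
    return 0.5 * (1 + 2 * t) * sk + 0.5 * (1 - 2 * t) * skm1

# quadratic Lagrange basis functions on the nodes -1, 0, 1
L0 = lambda t: t * (t - 1) / 2
L1 = lambda t: 1 - t * t
L2 = lambda t: t * (t + 1) / 2

sk, skm1 = 2.0, 3.0                            # arbitrary placeholder values
I2 = quad(lambda t: L2(t) * dX(t, sk, skm1))   # closed form: (s_k - s_{k-1}/3)/2
I1 = quad(lambda t: L1(t) * dX(t, sk, skm1))   # closed form: 2(s_k + s_{k-1})/3
I0 = quad(lambda t: L0(t) * dX(t, sk, skm1))   # closed form: (-s_k/3 + s_{k-1})/2
```

All three quadratures agree with the closed-form coefficients above to quadrature accuracy.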

Note 2

Assume that Lj, j = 0, 1, 2, and \( \frac {dx(t,\theta )}{dt} \) are defined by (9) and (11), respectively. Then we have

$$ \begin{array}{lll} {\int}_{-1}^{1} L_{0}(t) \frac{dX(t)}{dt}dt& =&{\int}_{-1}^{1} \frac{t(t-1)}{2} \left( \frac{(1+2t)}{2\cosh(\theta)}x_{k+1}-2tx_{k}+\frac{(2t-1)}{2\cosh(\theta)}x_{k-1}\right) \cosh(t\theta)dt \\ \\ & &+{\int}_{-1}^{1} \frac{t(t-1)}{2} \left( \frac{t(t+1)}{2\cosh(\theta)}x_{k+1}+(1-t^{2})x_{k}+\frac{t(t-1)}{2\cosh(\theta)}x_{k-1}\right)\theta \sinh(t\theta)dt \\ \\ &=& \left[ \frac{-1}{2\theta}\tanh(\theta)+\frac{1}{\theta^{2}}-\frac{1}{\theta^{3}}\tanh(\theta)\right] s_{k} \\ \\ &&+ \left[ \frac{-1}{2\theta}\tanh(\theta)+\frac{1}{\theta^{2}}-\frac{1}{\theta^{3}}\tanh(\theta)+\frac{2}{\theta^{2}}\cosh(\theta)-\frac{2}{\theta^{3}}\sinh(\theta)\right] s_{k-1} \\ \\ & & +\left[ -1+ \frac{1}{\theta}\tanh(\theta)-\frac{2}{\theta^{2}}-\frac{2}{\theta^{3}}\tanh(\theta)+\frac{2}{\theta^{2}}\cosh(\theta)-\frac{2}{\theta^{3}}\sinh(\theta)\right] x_{k-1}. \end{array} $$

Similarly

$$ \begin{array}{l} {\int}_{-1}^{1} L_{1}(t) \frac{dX(t)}{dt}dt= \left[ \frac{1}{\theta}\tanh(\theta)-\frac{2}{\theta^{2}}+\frac{2}{\theta^{3}}\tanh(\theta)\right] s_{k}\\ \\ + \left[ \frac{1}{\theta}\tanh(\theta)-\frac{2}{\theta^{2}}+\frac{2}{\theta^{3}}\tanh(\theta)-2\cosh(\theta)+\frac{12}{\theta}\sinh(\theta)-\frac{24}{\theta^{2}}\cosh(\theta)+\frac{24}{\theta^{3}}\sinh(\theta)\right] s_{k-1}\\ \\ + \left[ \frac{-1}{\theta}\tanh(\theta)+\frac{2}{\theta^{2}}-\frac{2}{\theta^{3}}\tanh(\theta)-2\cosh(\theta)+\frac{12}{\theta}\sinh(\theta)-\frac{24}{\theta^{2}}\cosh(\theta)+\frac{24}{\theta^{3}}\sinh(\theta)\right] x_{k-1} \end{array} $$

and

$$ \begin{array}{l} {\int}_{-1}^{1} L_{2}(t) \frac{dX(t)}{dt}dt =\left[ 1-\frac{3}{2\theta}\tanh(\theta)+\frac{3}{\theta^{2}}-\frac{3}{\theta^{3}}\tanh(\theta)\right] s_{k}\\ \\ \quad + \left[ 1-\frac{3}{2\theta}\tanh(\theta)+\frac{3}{\theta^{2}}-\frac{3}{\theta^{3}}\tanh(\theta)-\frac{2}{\theta^{2}}\cosh(\theta)+\frac{2}{\theta^{3}}\sinh(\theta)\right] s_{k-1}\\ \\ \quad + \left[ 1-\frac{1}{\theta}\tanh(\theta)+\frac{2}{\theta^{2}}-\frac{2}{\theta^{3}}\tanh(\theta)-\frac{2}{\theta^{2}}\cosh(\theta)+\frac{2}{\theta^{3}}\sinh(\theta)\right] x_{k-1} \end{array} $$

Proof of Lemma 2:

Proof Let ξk denote the angle between \(s^{\ast }_{k}\) and \(B_{k} s^{\ast }_{k}\), that is,

$$ \cos \xi_{k} = \frac{{s^{\ast}_{k}}^{T}B_{k} s^{\ast}_{k}}{\Vert s^{\ast}_{k}\Vert \Vert B_{k} s^{\ast}_{k}\Vert}, $$
(57)

and

$$ q_{k} = \frac{{s^{\ast}_{k}}^{T}B_{k} s^{\ast}_{k}}{ {s^{\ast}_{k}}^{T}s^{\ast}_{k}}. $$
(58)

By taking the trace of both sides of (3), we obtain

$$ Tr (B_{k+1})=Tr(B_{k})-\frac{\Vert B_{k} s^{\ast}_{k}\Vert^{2}}{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k}}+\frac{\Vert y_{k}^{\ast} \Vert^{2}}{{y_{k}^{\ast}}^{T} s^{\ast}_{k}}, $$
(59)

where Tr(B) denotes the trace of B. On the other hand, we have (see [19]):

$$ Det(B_{k+1})=Det(B_{k})\frac{{s^{\ast}_{k}}^{T} y_{k}^{\ast} }{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k} }, $$
(60)

where Det(B) is the determinant of B. Let

$$ \psi(B)=Tr(B)-\ln(Det(B)), $$
(61)
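The identities (59) and (60) can be checked directly on a small example. The sketch below (hypothetical values, plain Python, not the authors' code) applies the standard BFGS update B⁺ = B − (B s sᵀ B)/(sᵀ B s) + (y yᵀ)/(yᵀ s) to a 2 × 2 positive definite matrix:

```python
# Numerical check of the trace identity (59) and determinant identity (60)
# for the standard BFGS update.

def matvec(B, v):
    return [B[0][0] * v[0] + B[0][1] * v[1], B[1][0] * v[0] + B[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def bfgs_update(B, s, y):
    Bs = matvec(B, s)
    sBs = dot(s, Bs)
    ys = dot(y, s)              # must be positive to preserve definiteness
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys
             for j in range(2)] for i in range(2)]

def tr(B):  return B[0][0] + B[1][1]
def det(B): return B[0][0] * B[1][1] - B[0][1] * B[1][0]

B = [[2.0, 0.5], [0.5, 1.0]]        # symmetric positive definite
s, y = [1.0, -0.5], [0.7, 0.3]      # chosen so that y^T s > 0
Bp = bfgs_update(B, s, y)
Bs = matvec(B, s)

lhs_tr = tr(Bp)                     # left side of (59)
rhs_tr = tr(B) - dot(Bs, Bs) / dot(s, Bs) + dot(y, y) / dot(y, s)
lhs_det = det(Bp)                   # left side of (60)
rhs_det = det(B) * dot(s, y) / dot(s, Bs)
```

Both identities hold to rounding error.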

where B is a positive definite matrix. Using (59), (60) and (61), we obtain

$$\psi(B_{k+1})=\psi(B_{k})-\frac{\Vert B_{k} s^{\ast}_{k}\Vert^{2}}{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k}}+\frac{\Vert y_{k}^{\ast} \Vert^{2}}{{y_{k}^{\ast} }^{T} s^{\ast}_{k}}-\ln(\frac{{s^{\ast}_{k}}^{T} y_{k}^{\ast} }{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k} })$$
$$ =\psi(B_{k})-\left[\frac{\Vert B_{k} s^{\ast}_{k}\Vert \Vert s^{\ast}_{k}\Vert}{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k}}\right]^{2} \frac{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k}}{{s^{\ast}_{k}}^{T} s^{\ast}_{k}}+\frac{\Vert y_{k}^{\ast} \Vert^{2}}{{y_{k}^{\ast} }^{T} s^{\ast}_{k}}-\ln(\frac{{s^{\ast}_{k}}^{T} y_{k}^{\ast} }{{s^{\ast}_{k}}^{T} s_{k}^{\ast}}\frac{{s^{\ast}_{k}}^{T} s_{k}^{\ast}}{{s^{\ast}_{k}}^{T} B_{k} s^{\ast}_{k} }). $$
(62)

From the definitions (57) and (58), the relation (62) can be written as

$$\psi(B_{k+1})=\psi(B_{k})+\frac{\Vert y_{k}^{\ast} \Vert^{2}}{{y_{k}^{\ast}}^{T} s^{\ast}_{k}}-\ln(\frac{{s^{\ast}_{k}}^{T} y_{k}^{\ast} }{{s^{\ast}_{k}}^{T} s_{k}^{\ast}})-\frac{q_{k}}{ \cos^{2} \xi_{k}}+\ln(q_{k})$$
$$=\psi(B_{k})+\frac{\Vert y_{k}^{\ast} \Vert^{2}}{{y_{k}^{\ast} }^{T} s^{\ast}_{k}}-1-\ln(\frac{{s^{\ast}_{k}}^{T} y_{k}^{\ast} }{{s^{\ast}_{k}}^{T} s_{k}^{\ast}})+\ln (\cos^{2} \xi_{k})$$
$$ +[1-\frac{q_{k}}{ \cos^{2} \xi_{k}}+\ln (\frac{q_{k}}{ \cos^{2} \xi_{k}})]. $$
(63)

From (49) and (63), we have

$$\psi(B_{k+1})\leq\psi(B_{1})+(M-1-\ln(m))k$$
$$ +{\sum}_{j=1}^{k} \left( \ln(\cos^{2} \xi_{j})+1-\frac{q_{j}}{ \cos^{2} \xi_{j}}+\ln (\frac{q_{j}}{ \cos^{2} \xi_{j}})\right). $$
(64)

On the other hand, from the definition of ψ(B), we have

$$ \psi(B)= {\sum}_{i=1}^{n}(\lambda_{i}-\ln(\lambda_{i})), $$

where 0 < λ1 ≤ λ2 ≤ ... ≤ λn are the eigenvalues of B. Therefore, we have

$$ \psi(B)>0. $$

This, together with (64), results in

$$ \frac{1}{k} {\sum}_{j=1}^{k} \eta_{j}\leq {\psi(B_{1})}+(M-1-\ln(m)), $$
(65)

where

$$ \eta_{j}=- \ln(\cos^{2} \xi_{j}) -\left[1-\frac{q_{j}}{ \cos^{2} \xi_{j}}+\ln (\frac{q_{j}}{ \cos^{2} \xi_{j}})\right]. $$
(66)

Note that the function

$$ u(t)=1-t+\ln (t) $$

is nonpositive for all t > 0; therefore, we have ηj ≥ 0, ∀j. Let us now define Jk to be the set consisting of the ⌈pk⌉ indices corresponding to the ⌈pk⌉ smallest values of ηj for j ≤ k, and let \(\eta _{m_{k}}\) denote the largest of the ηj for j ∈ Jk. Then
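The nonpositivity of u(t) is easy to spot-check numerically (illustrative grid points):

```python
import math

# u(t) = 1 - t + ln(t) is nonpositive for t > 0, with maximum u(1) = 0
u = lambda t: 1 - t + math.log(t)
vals = [u(t) for t in (0.01, 0.5, 1.0, 2.0, 10.0)]
u_max = max(vals)
```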

$$ \begin{array}{ll} \frac{1}{k} {\sum}_{j=1}^{k} \eta_{j} & =\frac{1}{k}\left[{\sum}_{j=1, j\in J_{k}}^{k} \eta_{j}+ {\sum}_{j=1, j\notin J_{k}}^{k} \eta_{j}\right]\geq\frac{1}{k}\left[\eta_{m_{k}}+ {\sum}_{j=1, j\notin J_{k}}^{k} \eta_{j}\right]\\ & \geq\frac{1}{k}\left[\eta_{m_{k}}+ {\sum}_{j=1, j\notin J_{k}}^{k} \eta_{m_{k}}\right]=\frac{\eta_{m_{k}}}{k}\left[1+ {\sum}_{j=1, j\notin J_{k}}^{k} 1\right]\\ & =\frac{\eta_{m_{k}}}{k}(k+1-\lceil pk\rceil) \geq \eta_{m_{k}} (1-p). \end{array} $$

This together with (65) yields

$$ \eta_{j}\leq\eta_{m_{k}}\leq\frac{1}{(1-p)}[\psi(B_{1})+M-1-\ln(m)]=\beta_{0},~~\forall j\in J_{k}. $$
(67)

Since the term inside the brackets in (66) is less than or equal to zero, relations (66) and (67) imply

$$ -\ln(\cos^{2} \xi_{j})\leq\beta_{0},~~\forall j\in J_{k}. $$
(68)

Therefore,

$$ \cos \xi_{j}\geq e^{-\beta_{0}/2}\equiv\beta_{1},~~\forall j\in J_{k}. $$
(69)

In addition, using (66) and (68), we get

$$ 1-\frac{q_{j}}{ \cos^{2} \xi_{j}}+\ln (\frac{q_{j}}{ \cos^{2} \xi_{j}})\geq - \beta_{0},~~\forall j\in J_{k}. $$
(70)

The function u(t) achieves its maximum value at t = 1 and satisfies \(u(t)\rightarrow -\infty \) both as \(t\rightarrow 0\) and as \(t\rightarrow \infty \). It therefore follows that

$$ 0< \tilde{\beta}_{2}\leq\frac{q_{j}}{ \cos^{2} \xi_{j}}\leq \beta_{3},~~\forall j\in J_{k}, $$
(71)

where \(\tilde {\beta }_{2}\) and β3 are positive constants.

Now, using (69) and (71), we obtain

$$ {\beta^{2}_{1}}\tilde{\beta}_{2}\leq q_{j} \leq \beta_{3}. $$
(72)

Since

$$ \frac{\Vert B_{j}s^{\ast}_{j} \Vert}{\Vert s^{\ast}_{j}\Vert}=\frac{q_{j}}{\cos \xi_{j}}, $$

we obtain from (69) and (72)

$$ \frac{\Vert B_{j}s^{\ast}_{j} \Vert}{\Vert s^{\ast}_{j} \Vert}\leq \frac{\beta_{3}}{\beta_{1}},~~\forall j\in J_{k}. $$
(73)

The relations (72) and (73) imply (50) with \(a_{1}= \frac {\beta _{3}}{\beta _{1}},\) \(a_{2}={\beta ^{2}_{1}}\tilde {\beta }_{2}\) and a3 = β3. □

Proof of Lemma 3:

Proof The relation (44) implies

$$ {g_{k-1}^{T}d_{k-1}}\leq -c\Vert g_{k-1}\Vert^{2}, ~~\forall k>0. $$

Therefore

$$ \Vert g_{k-1}\Vert \Vert d_{k-1}\Vert\geq{\vert g_{k-1}^{T}d_{k-1}\vert}>c \Vert g_{k-1}\Vert^{2}, ~~\forall k>0, $$

equivalently,

$$ \Vert s_{k-1}\Vert > c{\alpha_{k-1}}\Vert g_{k-1}\Vert, ~~\forall k>0. $$
(74)

On the other hand, for wk defined by (33), we have

$$ \begin{array}{lll} \vert w_{k}\vert \!&\leq &\! \Vert s_{k-1}\Vert \Vert y_{k}\Vert+\Vert s_{k-1}\Vert\Vert y_{k-1}\Vert +\left\vert(\frac{1}{\theta}+\frac{2}{\theta^{3}})\tanh(\theta)-\frac{2}{\theta^{2}}\right\vert \alpha_{k} \Vert s_{k-1}\Vert\Vert g_{k}\Vert \\ &+&\!\left\vert-\frac{2}{\theta^{2}} + (\frac{1}{\theta} + \frac{2}{\theta^{3}})\tanh(\theta) - (2 + \frac{24}{\theta^{2}})\cosh(\theta) + (\frac{12}{\theta} + \frac{24}{\theta^{3}})\sinh(\theta)\right\vert \Vert s_{k-1}\Vert\Vert y_{k-1}\Vert \\ &+&\! \left\vert \frac{2}{\theta^{2}} - (\frac{1}{\theta} + \frac{2}{\theta^{3}})\tanh(\theta) - (2 + \frac{24}{\theta^{2}})\cosh(\theta) + (\frac{12}{\theta}+\frac{24}{\theta^{3}})\sinh(\theta)\right\vert \Vert y_{k-1}\Vert \Vert x_{k-1}\Vert \\ &+&\! \left\vert \frac{1}{\theta^{2}}-(\frac{1}{2\theta}+\frac{1}{\theta^{3}})\tanh(\theta)\right\vert \alpha_{k-1}\Vert g_{k-1}\Vert\Vert s_{k}\Vert \\ &+&\! \left\vert\frac{1}{\theta^{2}}-(\frac{1}{2\theta}+\frac{1}{\theta^{3}})\tanh(\theta)+\frac{2}{\theta^{2}}\cosh(\theta)-\frac{2}{\theta^{3}}\sinh(\theta)\right\vert \alpha_{k-1}\Vert s_{k-1}\Vert\Vert g_{k-1}\Vert \\ &+&\!\left\vert -1 - \frac{2}{\theta^{2}}+ (\frac{1}{\theta} - \frac{2}{\theta^{3}})\tanh(\theta)+ \frac{2}{\theta^{2}}\cosh(\theta) - \frac{2}{\theta^{3}}\sinh(\theta)\right\vert \alpha_{k-1}\Vert g_{k-1}\Vert \Vert x_{k-1}\Vert. \end{array} $$

From the fact that |θ| > μ1, we can deduce that there exist positive constants κ1, κ2, ..., κ6 such that

$$ \begin{array}{lll} \vert w_{k}\vert &\leq & \Vert s_{k-1}\Vert \Vert y_{k}\Vert+\Vert s_{k-1}\Vert\Vert y_{k-1}\Vert +\kappa_{1}\alpha_{k} \Vert s_{k-1}\Vert\Vert g_{k}\Vert +\kappa_{2} \Vert s_{k-1}\Vert\Vert y_{k-1}\Vert \\&&+ \kappa_{3} \Vert y_{k-1}\Vert \Vert x_{k-1}\Vert \\ \\ &+& \kappa_{4} \alpha_{k-1}\Vert g_{k-1}\Vert\Vert s_{k}\Vert +\kappa_{5}\alpha_{k-1}\Vert s_{k-1}\Vert\Vert g_{k-1}\Vert +\kappa_{6}\alpha_{k-1}\Vert g_{k-1}\Vert \Vert x_{k-1}\Vert \\ \\ &\leq &L \Vert s_{k-1}\Vert \Vert s_{k}\Vert+L\Vert s_{k-1}\Vert^{2} +\kappa_{1}\alpha_{k}\Vert s_{k-1}\Vert\Vert g_{k}\Vert +\kappa_{2} L\Vert s_{k-1}\Vert^{2} \\&&+ \kappa_{3} L \Vert s_{k-1}\Vert \Vert x_{k-1}\Vert \\ \\ &+& \kappa_{4} \alpha_{k-1}\Vert g_{k-1}\Vert\Vert s_{k}\Vert +\kappa_{5}\alpha_{k-1}\Vert s_{k-1}\Vert\Vert g_{k-1}\Vert +\kappa_{6}\alpha_{k-1}\Vert g_{k-1}\Vert \Vert x_{k-1}\Vert . \end{array} $$

Hence,

$$\begin{array}{lll} \vert w_{k}\vert &\leq &(3L\tau+\kappa_{1}\alpha_{k} \gamma+2\kappa_{2} L\tau+ \kappa_{3} L\tau+ \kappa_{5}\gamma L)\Vert s_{k-1}\Vert \\ && + \kappa_{4} \alpha_{k-1}\Vert g_{k-1}\Vert\Vert s_{k}\Vert +\kappa_{6}\alpha_{k-1}\Vert g_{k-1}\Vert \Vert x_{k-1}\Vert , \end{array}$$

where γ and τ are as given in (41) and (43), respectively. Thus, from the definition of \( y_{k}^{\ast } \) and relation (74), we have

$$\begin{array}{lll} \Vert y_{k}^{\ast}\Vert & \leq & \frac{\vert w_{k}\vert}{\Vert s_{k-1}\Vert} \\ & \leq & (3L\tau+\kappa_{1}\alpha_{k} \gamma+2\kappa_{2} L\tau+ \kappa_{3} L\tau+ \kappa_{5}\gamma L) + \kappa_{4}\Vert s_{k}\Vert \frac{ \alpha_{k-1}\Vert g_{k-1}\Vert}{\Vert s_{k-1}\Vert} \\&&+ \kappa_{6}\Vert x_{k-1}\Vert \frac{ \alpha_{k-1}\Vert g_{k-1}\Vert}{\Vert s_{k-1}\Vert} \\ & \leq & (3L\tau+\kappa_{1}\alpha_{k} \gamma+2\kappa_{2} L\tau+ \kappa_{3} L\tau+ \kappa_{5}\gamma L) + \frac{ \kappa_{4}}{c}\Vert s_{k}\Vert +\frac{ \kappa_{6}}{c} \Vert x_{k-1}\Vert \\ & \leq & (3L\tau+\kappa_{1}\alpha_{k} \gamma+2\kappa_{2} L\tau+ \kappa_{3} L\tau+ \kappa_{5}\gamma L) + 2\tau\frac{ \kappa_{4}}{c} +\tau\frac{ \kappa_{6}}{c}. \end{array}$$

Since αk is bounded, this completes the proof. □


About this article


Cite this article

Dehghani, R., Bidabadi, N. & Hosseini, M.M. Using nonlinear functions to approximate a new quasi-Newton method for unconstrained optimization problems. Numer Algor 87, 755–777 (2021). https://doi.org/10.1007/s11075-020-00986-7

