
Stochastic optimization with adaptive restart: a framework for integrated local and global learning

Journal of Global Optimization

Abstract

A common approach to global optimization is to combine local optimization methods with random restarts. Restarts are used as a performance boosting approach: they can help avoid "slow progress" when exploiting a potentially good solution, and they enable the discovery of multiple local solutions, thus improving the overall quality of the returned solution. A multi-start method is one way to integrate local and global approaches, in which the global search is used to restart the local search. Bayesian optimization methods aim to find global optima of functions that can only be evaluated point-wise through a possibly expensive oracle. We propose the stochastic optimization with adaptive restart (SOAR) framework, which uses the predictive capability of Gaussian process models to adaptively restart local search and to intelligently select restart locations using the information gathered so far. This approach attempts to balance exploitation and exploration of the solution space. We study the asymptotic convergence of SOAR to a global optimum, and we empirically evaluate its performance through a specific implementation that uses the trust region method as the local search component. Numerical experiments show that the proposed algorithm outperforms existing methodologies over a suite of test problems of varying dimension under a finite budget of function evaluations.



References

1. Ankenman, B., Nelson, B.L., Staum, J.: Stochastic Kriging for simulation metamodeling. Oper. Res. 58(2), 371–382 (2010)
2. Atkinson, A.: A segmented algorithm for simulated annealing. Stat. Comput. 31, 635–672 (1992)
3. Betrò, B., Schoen, F.: Sequential stopping rules for the multistart algorithm in global optimisation. Math. Prog. 38(3), 271–286 (1987)
4. Betrò, B., Schoen, F.: A stochastic technique for global optimization. Comput. Math. Appl. 21(6–7), 127–133 (1991)
5. Betrò, B., Schoen, F.: Optimal and sub-optimal stopping rules for the multistart algorithm in global optimization. Math. Prog. 57(1–3), 445–458 (1992)
6. Bouhlel, M.A., Bartoli, N., Regis, R.G., Otsmane, A., Morlier, J.: Efficient global optimization for high-dimensional constrained problems by using the Kriging models combined with the partial least squares method. Eng. Optim. 50(12), 2038–2053 (2018)
7. Bouhmala, N.: Combining simulated annealing with local search heuristic for MAX-SAT. J. Heur. 25(1), 47–69 (2019)
8. Calvin, J., Žilinskas, A.: On the convergence of the P-algorithm for one-dimensional global optimization of smooth functions. J. Optim. Theory Appl. 102(3), 479–495 (1999)
9. Chang, K.H., Hong, L.J., Wan, H.: Stochastic trust-region response-surface method (STRONG): a new response-surface framework for simulation optimization. INFORMS J. Comput. 25(2), 230–243 (2013)
10. Chen, C.H.: Stochastic Simulation Optimization: An Optimal Computing Budget Allocation, vol. 1. World Scientific, Singapore (2010)
11. Conn, A.R., Gould, N.I., Toint, P.L.: Trust Region Methods. SIAM, Philadelphia (2000)
12. Efron, B., Tibshirani, R.: Improvements on cross-validation: the .632+ bootstrap method. J. Am. Stat. Assoc. 92(438), 548–560 (1997)
13. Fu, M.C.: Handbook of Simulation Optimization, vol. 216. Springer, Berlin (2015)
14. Glidewell, M., Ng, K., Hensel, E.: A combinatorial optimization approach as a pre-processor for impedance tomography. In: Proceedings of the Annual Conference of the IEEE/Engineering in Medicine and Biology Society (1991)
15. Hart, W.E.: Sequential stopping rules for random optimization methods with applications to multistart local search. SIAM J. Optim. 9(1), 270–290 (1998)
16. Hu, X., Shonkwiler, R., Spruill, M.: Random restarts in global optimization. Technical report, Georgia Institute of Technology (1994)
17. Jones, D.R., Schonlau, M., Welch, W.J.: Efficient global optimization of expensive black-box functions. J. Global Optim. 13(4), 455–492 (1998)
18. Krityakierne, T., Shoemaker, C.A.: SOMS: surrogate multistart algorithm for use with nonlinear programming for global optimization. Int. Trans. Oper. Res. 24(5), 1139–1172 (2017)
19. Lagaris, I.E., Tsoulos, I.G.: Stopping rules for box-constrained stochastic global optimization. Appl. Math. Comput. 197(2), 622–632 (2008)
20. Li, H., Lim, A.: A meta-heuristic for the pickup and delivery problem with time windows. In: Proceedings of the 13th IEEE International Conference on Tools with Artificial Intelligence, pp. 160–167 (2001)
21. Locatelli, M.: Bayesian algorithms for one-dimensional global optimization. J. Global Optim. 10(1), 57–76 (1997)
22. Locatelli, M.: A note on the Griewank test function. J. Global Optim. 25(2), 169–174 (2003)
23. Locatelli, M., Schoen, F.: Global optimization based on local searches. Ann. Oper. Res. 240(1), 251–270 (2016)
24. Luby, M., Sinclair, A., Zuckerman, D.: Optimal speedup of Las Vegas algorithms. Inf. Process. Lett. 47(4), 173–180 (1993)
25. Luersen, M.A., Le Riche, R.: Globalized Nelder–Mead method for engineering optimization. Comput. Struct. 82(23), 2251–2260 (2004)
26. Mahinthakumar, G., Sayeed, M.: Hybrid genetic algorithm–local search methods for solving groundwater source identification inverse problems. J. Water Resour. Plan. Manag. 131(1), 45–57 (2005)
27. Martí, R., Lozano, J.A., Mendiburu, A., Hernando, L.: Multi-start methods. In: Handbook of Heuristics, pp. 1–21 (2016)
28. Martin, O.C., Otto, S.W.: Combining simulated annealing with local search heuristics. Ann. Oper. Res. 63(1), 57–75 (1996)
29. Mathesen, L., Pedrielli, G., Ng, S.H.: Trust region based stochastic optimization with adaptive restart: a family of global optimization algorithms. In: 2017 Winter Simulation Conference (WSC), pp. 2104–2115 (2017). https://doi.org/10.1109/WSC.2017.8247943
30. Müller, J., Day, M.: Surrogate optimization of computationally expensive black-box problems with hidden constraints. INFORMS J. Comput. 31(4), 689–702 (2019)
31. Murphy, M., Baker, E.: GLO: Global local optimizer. LLNL unclassified code 960007 (1995)
32. Neumann, F., Witt, C.: Runtime analysis of a simple ant colony optimization algorithm. Algorithmica 54(2), 243 (2009)
33. Nocedal, J., Wright, S.J.: Trust-region methods. In: Numerical Optimization, pp. 66–100. Springer (2006)
34. O'Donoghue, B., Candès, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2015)
35. Ohsaki, M., Yamakawa, M.: Stopping rule of multi-start local search for structural optimization. Struct. Multidiscip. Optim. 57(2), 595–603 (2018)
36. Okamoto, M., Nonaka, T., Ochiai, S., Tominaga, D.: Nonlinear numerical optimization with use of a hybrid genetic algorithm incorporating the modified Powell method. Appl. Math. Comput. 91(1), 63–72 (1998)
37. Pardalos, P.M., Romeijn, H.E.: Handbook of Global Optimization, vol. 2. Springer, Berlin (2013)
38. Peri, D., Tinti, F.: A multistart gradient-based algorithm with surrogate model for global optimization. Commun. Appl. Ind. Math. 3(1), 393 (2012)
39. Ranjan, P., Haynes, R., Karsten, R.: A computationally stable approach to Gaussian process interpolation of deterministic computer simulation data. Technometrics 53(4), 366–378 (2011)
40. Regis, R.G., Shoemaker, C.A.: Improved strategies for radial basis function methods for global optimization. J. Global Optim. 37(1), 113–135 (2007)
41. Regis, R.G., Shoemaker, C.A.: Parallel radial basis function methods for the global optimization of expensive functions. Eur. J. Oper. Res. 182(2), 514–535 (2007)
42. Regis, R.G., Shoemaker, C.A.: Parallel stochastic global optimization using radial basis functions. INFORMS J. Comput. 21(3), 411–426 (2009)
43. Regis, R.G., Shoemaker, C.A.: A quasi-multistart framework for global optimization of expensive functions using response surface models. J. Global Optim. 56(4), 1719–1753 (2013)
44. Santner, T.J., Williams, B.J., Notz, W.I.: The Design and Analysis of Computer Experiments. Springer, Berlin (2013)
45. Schoen, F.: Stochastic techniques for global optimization: a survey of recent advances. J. Global Optim. 1(3), 207–228 (1991)
46. Schoen, F.: Two-phase methods for global optimization. In: Handbook of Global Optimization, pp. 151–177. Springer (2002)
47. Schoen, F.: Two-phase methods for global optimization. In: Pardalos, P., Romeijn, H. (eds.) Handbook of Global Optimization, vol. 2, pp. 151–177. Kluwer Academic Publishers, Dordrecht (2015)
48. Shang, Y., Wan, Y., Fromherz, M.P., Crawford, L.S.: Toward adaptive cooperation between global and local solvers for continuous constraint problems. In: Proceedings of the CP'01 Workshop on Cooperative Solvers in Constraint Programming (2001)
49. Spall, J.C.: Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, vol. 65. Wiley, New York (2005)
50. Spall, J.C.: Stochastic optimization. In: Handbook of Computational Statistics, pp. 173–201. Springer (2012)
51. Theodosopoulos, T.: Some remarks on the optimal level of randomization in global optimization. arXiv preprint arXiv:math/0406095 (2004)
52. Torii, A.J., Lopez, R.H., Luersen, M.A.: A local-restart coupled strategy for simultaneous sizing and geometry truss optimization. Latin Am. J. Solids Struct. 8(3), 335–349 (2011)
53. Van Harmelen, F., Lifschitz, V., Porter, B.: Handbook of Knowledge Representation, vol. 1. Elsevier, London (2008)
54. Vehtari, A., Gelman, A., Gabry, J.: Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. 27(5), 1413–1432 (2017)
55. Voglis, C., Lagaris, I.E.: Towards ideal multistart: a stochastic approach for locating the minima of a continuous function inside a bounded domain. Appl. Math. Comput. 213(1), 216–229 (2009)
56. Yang, X.S.: Nature-Inspired Optimization Algorithms. Elsevier, London (2014)
57. Zabinsky, Z.B.: Stochastic Adaptive Search for Global Optimization. Kluwer Academic Publishers, Berlin (2003)
58. Zabinsky, Z.B.: Stochastic search methods for global optimization. In: Wiley Encyclopedia of Operations Research and Management Science. Wiley (2011)
59. Zabinsky, Z.B., Bulger, D., Khompatraporn, C.: Stopping and restarting strategy for stochastic sequential search in global optimization. J. Global Optim. 46(2), 273–286 (2010)
60. Zafar, A., Ghafoor, U., Yaqub, M.A., Hong, K.: Determination of the parameters in the designed hemodynamic response function using Nelder–Mead algorithm. In: 2018 18th International Conference on Control, Automation and Systems (ICCAS), pp. 1135–1140 (2018)


Author information


Corresponding author

Correspondence to Giulia Pedrielli.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: details of Gaussian processes

The basic idea behind meta-models such as Gaussian processes is that any smooth function \(f(\textit{\textbf{x}})\) can be interpreted as a realization from a stationary Gaussian process. Given n points in \( \mathbb {X}\), \(\mathbb {S}=\{\textit{\textbf{x}}_1, \ldots , \textit{\textbf{x}}_n\}\), and associated function values, \(f(\textit{\textbf{x}}_i\)), for \(i=1, \ldots , n\), we construct a stationary Gaussian process \(F(\textit{\textbf{x}})=\mu + Z(\textit{\textbf{x}})\) around the smooth function \(f(\textit{\textbf{x}})\) using the n observed points. Here, \(\mu \) is the constant mean (more complex models can be considered  [44]), and \(Z(\textit{\textbf{x}})\) is the Gaussian process \(Z(\textit{\textbf{x}})\sim GP(0,\tau ^2\mathbf {R})\) with spatial correlation matrix \(\mathbf {R}\) and overall process variance \(\tau ^2\).

We adopt the Gaussian correlation function, \(R_{ij} = \prod _{l=1}^d e^{-(\theta _l|x_{il}-x_{jl}|)^2}\) for \(i,j = 1, \ldots , n\), where d is the problem dimension and \(\varvec{\theta }\) is the d-dimensional vector of hyperparameters that collects the correlation factors, which smooth the predictor with varying intensity. The two parameters \(\mu \) and \(\tau ^2\) can be estimated through the following maximum likelihood estimators  [44],

$$\begin{aligned} \hat{\mu }&= \frac{\textit{\textbf{1}}_n^T\mathbf {R}^{-1}\textit{\textbf{f}}}{\textit{\textbf{1}}^T_n\mathbf {R}^{-1}\textit{\textbf{1}}_n}, \\ \hat{\tau }^2&= \frac{(\textit{\textbf{f}}-\textit{\textbf{1}}_n\hat{\mu })^T \mathbf {R}^{-1}(\textit{\textbf{f}}-\textit{\textbf{1}}_n\hat{\mu })}{n} \end{aligned}$$

where \(\textit{\textbf{f}}\) represents the n-dimensional vector of the function evaluations at the sampled points within the set \(\mathbb {S}=\left\{ \textit{\textbf{x}}_{1},\textit{\textbf{x}}_{2},\ldots ,\textit{\textbf{x}}_{n}\right\} \), and \( \textit{\textbf{1}}_n\) is an n-vector of ones.

Thus, we can build a predictor function \(\hat{f}(\textit{\textbf{x}})\) and model variance \(\hat{s}^2(\textit{\textbf{x}})\) for \(\textit{\textbf{x}}\in \mathbb {X}\) following  [44].

The predictor function is defined for any \(\textit{\textbf{x}} \in {\mathbb {X}}\):

$$\begin{aligned} \hat{f}(\textit{\textbf{x}}) = \hat{\mu } + \mathbf {r}(\textit{\textbf{x}})^T\mathbf {R}^{-1}(\textit{\textbf{f}} - \mathbf {1}_n\hat{\mu }) \end{aligned}$$

with a corresponding predictive model variance of:

$$\begin{aligned} \text{Var}[\hat{f}(\textit{\textbf{x}})] = \hat{s}^2(\textit{\textbf{x}}) = \tau ^2\left( 1-\mathbf {r}(\textit{\textbf{x}})^T\mathbf {R}^{-1}\mathbf {r}(\textit{\textbf{x}}) + \frac{(1-\mathbf {1}_n^T\mathbf {R}^{-1}\mathbf {r}(\textit{\textbf{x}}))^2}{\mathbf {1}_n^T\mathbf {R}^{-1}\mathbf {1}_n} \right) . \end{aligned}$$

where \(\mathbf {r}(\textit{\textbf{x}})\) is the n-dimensional vector of correlations between the process at the prediction location \(\textit{\textbf{x}}\) and at the sampled locations \(\textit{\textbf{x}}_i\in \mathbb {S}\):

$$\begin{aligned} r_i(\textit{\textbf{x}}) = \text{ Corr }(Z(\textit{\textbf{x}}),Z(\textit{\textbf{x}}_i)),\textit{\textbf{x}}_i\in \mathbb {S}. \end{aligned}$$

The interested reader can refer to  [44] and the references therein.
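For concreteness, the following sketch evaluates these quantities with NumPy under the simplifying assumption that the hyperparameters \(\varvec{\theta }\) are given (in practice they are also estimated by maximum likelihood). The function name gp_fit_predict and the small jitter added to \(\mathbf {R}\) are choices made for this illustration, not part of the SOAR implementation.

```python
# A minimal sketch of the constant-mean Gaussian process predictor of Appendix A.
import numpy as np

def gp_fit_predict(S, f, theta, X_new):
    """Return the GP predictor f_hat and variance s^2 at the rows of X_new.

    S      : (n, d) array of sampled locations.
    f      : (n,) array of function values at the rows of S.
    theta  : (d,) array of correlation hyperparameters (assumed given here).
    X_new  : (m, d) array of prediction locations.
    """
    n = S.shape[0]
    ones = np.ones(n)

    # Gaussian correlation: R_ij = prod_l exp(-(theta_l * |x_il - x_jl|)^2)
    def corr(A, B):
        diff = A[:, None, :] - B[None, :, :]          # (|A|, |B|, d)
        return np.exp(-np.sum((theta * np.abs(diff)) ** 2, axis=2))

    R = corr(S, S) + 1e-10 * np.eye(n)                # jitter for numerical stability
    R_inv = np.linalg.inv(R)

    # Maximum likelihood estimators of mu and tau^2
    mu_hat = (ones @ R_inv @ f) / (ones @ R_inv @ ones)
    tau2_hat = (f - mu_hat) @ R_inv @ (f - mu_hat) / n

    # Predictor and predictive variance at the new locations
    r = corr(X_new, S)                                # (m, n) cross-correlations
    f_hat = mu_hat + r @ R_inv @ (f - mu_hat)
    s2 = tau2_hat * (
        1.0 - np.sum(r * (r @ R_inv), axis=1)
        + (1.0 - r @ R_inv @ ones) ** 2 / (ones @ R_inv @ ones)
    )
    return f_hat, s2
```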

Appendix B: details of the trust region algorithm

The implementation of SOAR in this paper uses the trust region method  [11, 33] as its local search, an iterative derivative-free optimization approach that, under appropriate conditions, is guaranteed to converge to a local minimum  [33]. Each iteration of the trust region algorithm approximates the objective function (usually with a linear or quadratic model) around a location referred to as the centroid of the search, and then minimizes the approximating function over a “trust region” (usually a simplex or ellipsoid). The simple form of the approximation makes this minimization problem easy to solve. The minimizer is used as the centroid for the next iteration if the linear/quadratic surface is a good approximation of the true function within the trust region.

More specifically, the \(\ell _{k}^\mathrm{th}\) iteration of our implementation of the trust region method starts with a centroid \(\tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}}\) of an associated hypercube trust region, denoted \(\mathbb {R}_{\ell _{k}}\), with side length \(r_{\ell _{k}}\). An iteration of the algorithm results in a centroid update and/or a trust region size update, obtained by minimizing a quadratic model, \(\hat{f}_Q(\textit{\textbf{x}})\). The quadratic model is given by:

$$\begin{aligned} \hat{f}_Q({\textit{\textbf{x}}}) =&f(\varvec{\tilde{x}}^{c}_{\ell _{k}}) + \nabla f(\varvec{\tilde{x}}^{c}_{\ell _{k}})^T({\textit{\textbf{x}}} - \varvec{\tilde{x}}^{c}_{\ell _{k}}) \nonumber \\&+ \frac{1}{2}({\textit{\textbf{x}}} - \varvec{\tilde{x}}^{c}_{\ell _{k}})^T \nabla ^2 f(\varvec{\tilde{x}}^{c}_{\ell _{k}})({\textit{\textbf{x}}} - \varvec{\tilde{x}}^{c}_{\ell _{k}}) \end{aligned}$$
(13)

where \(\nabla f(\varvec{\tilde{x}}^{c}_{\ell _{k}})\) is an approximation of the gradient at the centroid, and \(\nabla ^2 f(\varvec{\tilde{x}}^{c}_{\ell _{k}})\) is an approximation of the Hessian matrix. Hence, the Trust Region Sub-problem (TRS):

$$\begin{aligned} \text{(TRS) } \ {\tilde{\textit{\textbf{x}}}}^{*}_{\ell _{k}} \in \arg \min&\ \ \hat{f}_Q({\textit{\textbf{x}}}) \nonumber \\ \text{ s.t. }&\ \ {\textit{\textbf{x}}} \in \mathbb {R}_{\ell _{k}}. \end{aligned}$$
(14)

where \({\tilde{\textit{\textbf{x}}}}^{*}_{\ell _{k}}\) becomes the candidate centroid for the next trust region iteration.
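As a concrete illustration, the sketch below assembles the quadratic model of Eq. (13) from central finite-difference approximations of the gradient and Hessian and minimizes it over the hypercube trust region with an off-the-shelf bounded optimizer. This is a minimal sketch, not the authors' implementation: the function names (solve_trs, f_q), the finite-difference step h, and the use of SciPy's L-BFGS-B solver are assumptions made for the example.

```python
# A minimal sketch of one trust-region sub-problem (TRS) solve.
import numpy as np
from scipy.optimize import minimize

def solve_trs(f, x_c, r, h=1e-4):
    """Minimize a quadratic model of f around centroid x_c over the hypercube
    trust region with side length r; return the candidate centroid and model."""
    d = x_c.size
    f_c = f(x_c)
    E = np.eye(d)

    # Central finite-difference approximations of the gradient and Hessian.
    grad = np.array([(f(x_c + h * E[i]) - f(x_c - h * E[i])) / (2 * h)
                     for i in range(d)])
    hess = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            hess[i, j] = (
                f(x_c + h * E[i] + h * E[j]) - f(x_c + h * E[i] - h * E[j])
                - f(x_c - h * E[i] + h * E[j]) + f(x_c - h * E[i] - h * E[j])
            ) / (4 * h ** 2)

    def f_q(x):
        # Quadratic model of Eq. (13) around the current centroid.
        s = x - x_c
        return f_c + grad @ s + 0.5 * s @ hess @ s

    # Hypercube trust region R_{l_k}: side length r centered at x_c.
    bounds = [(xi - r / 2, xi + r / 2) for xi in x_c]
    res = minimize(f_q, x0=x_c, bounds=bounds, method="L-BFGS-B")
    return res.x, f_q                                   # candidate centroid, model
```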

Two decisions need to be made by the algorithm to progress to the next iteration: (1) whether to accept the candidate centroid \({\tilde{\textit{\textbf{x}}}}^{*}_{\ell _{k}}\); and (2) whether to shrink or expand the size parameter \(r_{\ell _{k}}\). The ratio-comparison test answers these questions by comparing the true function value \(f(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}})\) to the predicted value from the quadratic model \(\hat{f}_Q(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}})\), and constructing the following statistic  [9, 33],

$$\begin{aligned} \rho = \frac{f(\varvec{\tilde{x}}^{c}_{\ell _{k}})-f(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}})}{\hat{f}_Q(\varvec{\tilde{x}}^{c}_{\ell _{k}})-\hat{f}_Q(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}})}. \end{aligned}$$
(15)

When \(\rho > 1\), the step produced a better-than-predicted reduction in the objective function. Conversely, \(\rho < 0\) indicates that the current model is inadequate over the current trust region: although a reduction in the objective value was predicted, the trust region step produced a worse solution, i.e., \(f(\varvec{\tilde{x}}^{c}_{\ell _{k}})<f(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}})\).

Comparing \(\rho \) with the user-defined threshold values \(\eta _{1},\eta _{2}\), with \(0<\eta _1\le \eta _2<1\), we have three possible cases:

Case 1:

\(\rho \le \eta _1 \implies \) keep current centroid \(\tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}+1} \leftarrow \tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}}\); reduce trust region size \(r_{\ell _{k}+1} \leftarrow r_{\ell _{k}} \cdot \omega \).

Case 2:

\(\eta _1 < \rho \le \eta _2 \implies \) accept candidate centroid \(\tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}+1} \leftarrow \varvec{\tilde{x}}^{*}_{\ell _{k}}\); keep current trust region size \(r_{\ell _{k}+1}\leftarrow r_{\ell _{k}}\).

Case 3:

\(\rho > \eta _2 \implies \) accept candidate centroid \(\tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}+1} \leftarrow \varvec{\tilde{x}}^{*}_{\ell _{k}}\); expand trust region size \(r_{\ell _{k}+1} \leftarrow r_{\ell _{k}} \cdot \gamma \).

where \(\omega \in (0,1)\) is the trust region reduction rate and \(\gamma > 1\) is the trust region expansion rate.

Under Case 1, we reformulate the subproblem (TRS) with the same local quadratic model but under more restrictive trust region constraints. Under Case 2 or Case 3, we recompute \(\nabla f(\tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}})\) and \(\nabla ^2 f(\tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}})\) in order to update the quadratic model \(\hat{f}_Q(\textit{\textbf{x}})\), since the trust region centroid has moved.
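Putting the ratio test and the three cases together, one iteration of the local search might be organized as in the sketch below. It reuses the hypothetical solve_trs routine from the previous sketch, and the threshold and rate defaults (eta1, eta2, omega, gamma) are illustrative, not the values used in the experiments.

```python
# A minimal sketch of one trust-region iteration built around the
# ratio-comparison test of Eq. (15) and the three cases above.
def trust_region_step(f, x_c, r, eta1=0.1, eta2=0.75, omega=0.5, gamma=2.0):
    """Return the updated centroid and trust region size after one iteration."""
    x_star, f_q = solve_trs(f, x_c, r)        # candidate centroid from (TRS)

    # Ratio of actual to predicted reduction, Eq. (15).
    predicted = f_q(x_c) - f_q(x_star)
    actual = f(x_c) - f(x_star)
    rho = actual / predicted if predicted != 0 else 0.0

    if rho <= eta1:            # Case 1: keep centroid, shrink the region
        # (a full implementation would keep the same quadratic model and only
        #  tighten the trust region constraint, as described in the text)
        return x_c, r * omega
    elif rho <= eta2:          # Case 2: accept candidate, keep region size
        return x_star, r
    else:                      # Case 3: accept candidate, expand the region
        return x_star, r * gamma
```

Iterating this step until one of the stopping criteria below fires yields a local search of the kind used within SOAR.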

The following stopping criteria are used (in addition to the restart-related conditions):

Criterion 1:

\(\Vert \nabla f(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}}) \Vert < \epsilon _1\)

Criterion 2:

\(\Vert \varvec{\delta }_{\ell _{k}} \Vert < \epsilon _2\)

where \(\Vert \cdot \Vert \) is the Euclidean norm, \(\epsilon _{1}, \epsilon _{2} > 0\) are user-defined parameters, and \(\varvec{\delta }_{\ell _{k}} = \tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}} - \tilde{\textit{\textbf{x}}}^{c}_{\ell _{k}}\) represents the step size between the current and proposed centroid. Note that \(\mathbb {R}_{\ell _{k}} \rightarrow \mathbb {X}\) as \(\ell _{k} \rightarrow \infty \), and \(||\varvec{\delta }_{\ell _{k}}||_{2} \rightarrow 0\) as \(\ell _{k} \rightarrow \infty \).
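As a small companion to these criteria, a stopping check could be written as in the sketch below; grad_at_candidate stands for the approximated gradient \(\nabla f(\tilde{\textit{\textbf{x}}}^{*}_{\ell _{k}})\), and the tolerance defaults are placeholders for the user-defined \(\epsilon _{1}, \epsilon _{2}\).

```python
import numpy as np

def should_stop(grad_at_candidate, x_star, x_c, eps1=1e-6, eps2=1e-8):
    """Criterion 1: small approximate gradient; Criterion 2: small step."""
    crit1 = np.linalg.norm(grad_at_candidate) < eps1
    crit2 = np.linalg.norm(x_star - x_c) < eps2
    return crit1 or crit2
```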

It was shown in  [33] (Theorem 4.8) that, as the number of iterations grows, \(\ell _{k}\rightarrow \infty \):

$$\begin{aligned} \tilde{\textit{\textbf{x}}}^{c}_{\infty } \in {\mathbb {X}} : f(\tilde{\textit{\textbf{x}}}^{c}_{\infty }) \le f(\textit{\textbf{x}}), \ \forall \textit{\textbf{x}} \in \mathcal {N}_\epsilon (\tilde{\textit{\textbf{x}}}^{c}_{\infty }), \ \forall \ \epsilon > 0 \end{aligned}$$
(16)

regardless of the initial starting location of the trust region search, where \(\tilde{\textit{\textbf{x}}}^{c}_{\infty }\) is the centroid in the limit, and \(\mathcal {N}_\epsilon (\tilde{\textit{\textbf{x}}}^{c}_{\infty })\) is an \(\epsilon \) radius ball centered at \(\tilde{\textit{\textbf{x}}}^{c}_{\infty }\). The result requires (1) Lipschitz continuity for f, (2) boundedness of f, (3) \(\eta _1 \in (0,0.25)\), and (4) uniform boundedness in norm of the Hessian of f by a constant \(\beta \).


About this article


Cite this article

Mathesen, L., Pedrielli, G., Ng, S.H. et al. Stochastic optimization with adaptive restart: a framework for integrated local and global learning. J Glob Optim 79, 87–110 (2021). https://doi.org/10.1007/s10898-020-00937-5

