Derivative-free optimization methods

Jeffrey Larson; Matt Menickelly; Stefan M. Wild

doi:10.1017/S0962492919000060

Derivative-free optimization methods

Published online by Cambridge University Press: 14 June 2019

Jeffrey Larson ,

Matt Menickelly and

Stefan M. Wild

Show author details

Jeffrey Larson: Affiliation:
Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL 60439, USA E-mail: jmlarson@anl.gov, mmenickelly@anl.gov, wild@anl.gov
Matt Menickelly: Affiliation:
Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL 60439, USA E-mail: jmlarson@anl.gov, mmenickelly@anl.gov, wild@anl.gov
Stefan M. Wild: Affiliation:
Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL 60439, USA E-mail: jmlarson@anl.gov, mmenickelly@anl.gov, wild@anl.gov

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

In many optimization problems arising from scientific, engineering and artificial intelligence applications, objective and constraint functions are available only as the output of a black-box or simulation oracle that does not provide derivative information. Such settings necessitate the use of methods for derivative-free, or zeroth-order, optimization. We provide a review and perspectives on developments in these methods, with an emphasis on highlighting recent developments and on unifying treatment of such problems in the non-linear optimization and machine learning literature. We categorize methods based on assumed properties of the black-box functions, as well as features of the methods. We first overview the primary setting of deterministic methods applied to unconstrained, non-convex optimization problems where the objective function is defined by a deterministic black-box oracle. We then discuss developments in randomized methods, methods that assume some additional structure about the objective (including convexity, separability and general non-smooth compositions), methods for problems where the output of the black-box oracle is stochastic, and methods for handling different types of constraints.

Type: Research Article
Information: Acta Numerica , Volume 28 , 01 May 2019 , pp. 287 - 404

DOI: https://doi.org/10.1017/S0962492919000060 [Opens in a new window]
Copyright: © Cambridge University Press, 2019

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

REFERENCES 12

Abramson, M. A. (2005), ‘Second-order behavior of pattern search’, SIAM J. Optim. 16, 515–530.Google Scholar

Abramson, M. A. and Audet, C. (2006), ‘Convergence of mesh adaptive direct search to second-order stationary points’, SIAM J. Optim. 17, 606–619.Google Scholar

Abramson, M. A., Audet, C. and Dennis, J. E. Jr (2004), ‘Generalized pattern searches with derivative information’, Math. Program. 100, 3–25.Google Scholar

Abramson, M. A., Audet, C., Chrissis, J. and Walston, J. (2009a), ‘Mesh adaptive direct search algorithms for mixed variable optimization’, Optim. Lett. 3, 35–47.Google Scholar

Abramson, M. A., Audet, C., Dennis, J. E. Jr and Le Digabel, S. (2009b), ‘OrthoMADS: A deterministic MADS instance with orthogonal directions’, SIAM J. Optim. 20, 948–966.Google Scholar

Abramson, M. A., Frimannslund, L. and Steihaug, T. (2013), ‘A subclass of generating set search with convergence to second-order stationary points’, Optim. Methods Software 29, 900–918.Google Scholar

Abramson, M., Audet, C. and Dennis, J. E. Jr (2007), ‘Filter pattern search algorithms for mixed variable constrained optimization problems’, Pacific J. Optim. 3, 477–500.Google Scholar

Agarwal, A., Dekel, O. and Xiao, L. (2010), Optimal algorithms for online convex optimization with multi-point bandit feedback. In 23rd Conference on Learning Theory (COLT 2010) (Kalai, A. T. and Mohri, M., eds), pp. 28–40.Google Scholar

Agarwal, A., Foster, D. P., Hsu, D. J., Kakade, S. M. and Rakhlin, A. (2011), Stochastic convex optimization with bandit feedback. In Advances in Neural Information Processing Systems 24 (Shawe-Taylor, J. et al. , eds), Curran Associates, pp. 1035–1043.Google Scholar

Agarwal, A., Wainwright, M. J., Bartlett, P. L. and Ravikumar, P. K. (2009), Information-theoretic lower bounds on the oracle complexity of convex optimization. In Advances in Neural Information Processing Systems 22 (Bengio, Y. et al. , eds), Curran Associates, pp. 1–9.Google Scholar

Agrawal, R. (1995), ‘Sample mean based index policies with

$O(\log n)$ regret for the multi-armed bandit problem’, Adv. Appl. Probab. 27, 1054–1078.Google Scholar

Alarie, S., Amaioua, N., Audet, C., Le Digabel, S. and Leclaire, L.-A. (2018), Selection of variables in parallel space decomposition for the mesh adaptive direct search algorithm. Technical report, Cahier du GERAD G-2018-38, GERAD.Google Scholar

Alexandrov, N., Dennis, J. E. Jr, Lewis, R. M. and Torczon, V. (1998), ‘A trust region framework for managing the use of approximation models in optimization’, Struct. Optim. 15, 16–23.Google Scholar

Amaioua, N., Audet, C., Conn, A. R. and Le Digabel, S. (2018), ‘Efficient solution of quadratically constrained quadratic subproblems within the mesh adaptive direct search algorithm’, European J. Oper. Res. 268, 13–24.Google Scholar

Amaran, S., Sahinidis, N. V., Sharda, B. and Bury, S. J. (2015), ‘Simulation optimization: A review of algorithms and applications’, Ann. Oper. Res. 240, 351–380.Google Scholar

Anderson, H. L. (1986), ‘Scientific uses of the MANIAC’, J. Statist. Phys. 43, 731–748.Google Scholar

Arouxét, M. B., Echebest, N. and Pilotta, E. A. (2011), ‘Active-set strategy in Powell’s method for optimization without derivatives’, Comput. Appl. Math. 30, 171–196.Google Scholar

Audet, C. (2004), ‘Convergence results for generalized pattern search algorithms are tight’, Optim. Engng 5, 101–122.Google Scholar

Audet, C. (2014), A survey on direct search methods for blackbox optimization and their applications. In Mathematics Without Boundaries: Surveys in Interdisciplinary Research (Pardalos, P. M. and Rassias, T. M., eds), Springer, pp. 31–56.Google Scholar

Audet, C. and Dennis, J. E. Jr (2000), ‘Pattern search algorithms for mixed variable programming’, SIAM J. Optim. 11, 573–594.Google Scholar

Audet, C. and Dennis, J. E. Jr (2002), ‘Analysis of generalized pattern searches’, SIAM J. Optim. 13, 889–903.Google Scholar

Audet, C. and Dennis, J. E. Jr (2004), ‘A pattern search filter method for nonlinear programming without derivatives’, SIAM J. Optim. 14, 980–1010.Google Scholar

Audet, C. and Dennis, J. E. Jr (2006), ‘Mesh adaptive direct search algorithms for constrained optimization’, SIAM J. Optim. 17, 188–217.Google Scholar

Audet, C. and Dennis, J. E. Jr (2009), ‘A progressive barrier for derivative-free nonlinear programming’, SIAM J. Optim. 20, 445–472.Google Scholar

Audet, C. and Hare, W. L. (2017), Derivative-Free and Blackbox Optimization, Springer.Google Scholar

Audet, C., Bigeon, J., Cartier, D., Le Digabel, S. and Salomon, L. (2018a), Performance indicators in multiobjective optimization. Technical report, Cahier du GERAD G-2018-90, GERAD.Google Scholar

Audet, C., Dennis, J. E. Jr and Le Digabel, S. (2008a), ‘Parallel space decomposition of the mesh adaptive direct search algorithm’, SIAM J. Optim. 19, 1150–1170.Google Scholar

Audet, C., Ianni, A., Le Digabel, S. and Tribes, C. (2014), ‘Reducing the number of function evaluations in mesh adaptive direct search algorithms’, SIAM J. Optim. 24, 621–642.Google Scholar

Audet, C., Ihaddadene, A., Le Digabel, S. and Tribes, C. (2018b), ‘Robust optimization of noisy blackbox problems using the mesh adaptive direct search algorithm’, Optim. Lett. 12, 675–689.Google Scholar

Audet, C., Le Digabel, S. and Peyrega, M. (2015), ‘Linear equalities in blackbox optimization’, Comput. Optim. Appl. 61, 1–23.Google Scholar

Audet, C., Savard, G. and Zghal, W. (2008b), ‘Multiobjective optimization through a series of single-objective formulations’, SIAM J. Optim. 19, 188–210.Google Scholar

Audet, C., Savard, G. and Zghal, W. (2010), ‘A mesh adaptive direct search algorithm for multiobjective optimization’, European J. Oper. Res. 204, 545–556.Google Scholar

Auer, P. (2002), ‘Using confidence bounds for exploitation-exploration trade-offs’, J. Mach. Learn. Res. 3, 397–422.Google Scholar

Auer, P., Cesa-Bianchi, N. and Fischer, P. (2002), ‘Finite-time analysis of the multiarmed bandit problem’, Machine Learning 47, 235–256.Google Scholar

Auer, P., Cesa-Bianchi, N., Freund, Y. and Schapire, R. E. (2003), ‘The nonstochastic multiarmed bandit problem’, SIAM J. Comput. 32, 48–77.Google Scholar

Auger, A., Hansen, N., Perez Zerpa, J., Ros, R. and Schoenauer, M. (2009), Experimental comparisons of derivative free optimization algorithms. In Experimental Algorithms (SEA 2009) (Vahrenhold, J., ed.), Vol. 5526 of Lecture Notes in Computer Science, Springer, pp. 3–15.Google Scholar

Augustin, F. and Marzouk, Y. M. (2014), NOWPAC: A provably convergent derivative-free nonlinear optimizer with path-augmented constraints. arXiv:1403.1931 Google Scholar

Augustin, F. and Marzouk, Y. M. (2017), A trust-region method for derivative-free nonlinear constrained stochastic optimization. arXiv:1703.04156 Google Scholar

Avriel, M. and Wilde, D. J. (1967), ‘Optimal condenser design by geometric programming’, Ind. Engng Chem. Process Des. Dev. 6, 256–263.Google Scholar

Bach, F. and Perchet, V. (2016), Highly-smooth zero-th order online optimization. In 29th Annual Conference on Learning Theory (COLT 2016) (Feldman, V. et al. , eds). Proc. Mach. Learn. Res. 49, 257–283.Google Scholar

Bagirov, A. M. and Ugon, J. (2006), ‘Piecewise partially separable functions and a derivative-free algorithm for large scale nonsmooth optimization’, J. Global Optim. 35, 163–195.Google Scholar

Bagirov, A. M., Karasözen, B. and Sezer, M. (2007), ‘Discrete gradient method: Derivative-free method for nonsmooth optimization’, J. Optim. Theory Appl. 137, 317–334.Google Scholar

Bajaj, I., Iyer, S. S. and Hasan, M. M. F. (2018), ‘A trust region-based two phase algorithm for constrained black-box and grey-box optimization with infeasible initial point’, Comput. Chem. Engng 116, 306–321.Google Scholar

Balaprakash, P., Salim, M., Uram, T. D., Vishwanath, V. and Wild, S. M. (2018), DeepHyper: Asynchronous hyperparameter search for deep neural networks. In 25th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC18). doi:10.1007/s10589-019-00063-3 Google Scholar

Balasubramanian, K. and Ghadimi, S. (2018), Zeroth-order (non)-convex stochastic optimization via conditional gradient and gradient updates. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al. , eds), Curran Associates, pp. 3459–3468.Google Scholar

Balasubramanian, K. and Ghadimi, S. (2019), Zeroth-order nonconvex stochastic optimization: Handling constraints, high-dimensionality, and saddle-points. arXiv:1809.06474 Google Scholar

Bandeira, A. S., Scheinberg, K. and Vicente, L. N. (2012), ‘Computation of sparse low degree interpolating polynomials and their application to derivative-free optimization’, Math. Program. 134, 223–257.Google Scholar

Bandeira, A. S., Scheinberg, K. and Vicente, L. N. (2014), ‘Convergence of trust-region methods based on probabilistic models’, SIAM J. Optim. 24, 1238–1264.Google Scholar

Banichuk, N. V., Petrov, V. M. and Chernous’ko, F. L. (1966), ‘The solution of variational and boundary value problems by the method of local variations’, USSR Comput. Math. Math. Phys. 6, 1–21.Google Scholar

Barton, R. R. and Ivey, J. S. Jr (1996), ‘Nelder–Mead simplex modifications for simulation optimization’, Manag. Sci. 42, 954–973.Google Scholar

Barzilai, J. and Borwein, J. M. (1988), ‘Two-point step size gradient methods’, IMA J. Numer. Anal. 8, 141–148.Google Scholar

Bauschke, H. H., Hare, W. L. and Moursi, W. M. (2014), ‘A derivative-free comirror algorithm for convex optimization’, Optim. Methods Software 30, 706–726.Google Scholar

Baydin, A. G., Pearlmutter, B. A., Radul, A. A. and Siskind, J. M. (2018), ‘Automatic differentiation in machine learning: A survey’, J. Mach. Learn. Res. 18(153), 1–43.Google Scholar

Belitz, P. and Bewley, T. (2013), ‘New horizons in sphere-packing theory, II: Lattice-based derivative-free optimization via global surrogates’, J. Global Optim. 56, 61–91.Google Scholar

Belloni, A., Liang, T., Narayanan, H. and Rakhlin, A. (2015), Escaping the local minima via simulated annealing: Optimization of approximately convex functions. In 28th Conference on Learning Theory (COLT 2015). Proc. Mach. Learn. Res. 40, 240–265.Google Scholar

Belotti, P., Kirches, C., Leyffer, S., Linderoth, J., Luedtke, J. and Mahajan, A. (2013), Mixed-integer nonlinear optimization. In Acta Numerica, Vol. 22, Cambridge University Press, pp. 1–131.Google Scholar

Berahas, A. S., Byrd, R. H. and Nocedal, J. (2019), ‘Derivative-free optimization of noisy functions via quasi-Newton methods’, SIAM J. Optim., to appear. arXiv:1803.10173 Google Scholar

Bertsimas, D. and Nohadani, O. (2010), ‘Robust optimization with simulated annealing’, J. Global Optim. 48, 323–334.Google Scholar

Bertsimas, D., Nohadani, O. and Teo, K. M. (2010), ‘Robust optimization for unconstrained simulation-based problems’, Oper. Res. 58, 161–178.Google Scholar

Bhatnagar, S., Prasad, H. and Prashanth, L. A. (2013), Kiefer–Wolfowitz algorithm. In Stochastic Recursive Algorithms for Optimization: Simultaneous Perturbation Methods, Vol. 434 of Lecture Notes in Control and Information Sciences, Springer, pp. 31–39.Google Scholar

Bibi, A., Bergou, E. H., Sener, O., Ghanem, B. and Richtárik, P. (2019), A stochastic derivative-free optimization method with importance sampling. arXiv:1902.01272 Google Scholar

Billups, S. C., Larson, J. and Graf, P. (2013), ‘Derivative-free optimization of expensive functions with computational error using weighted regression’, SIAM J. Optim. 55, 27–53.Google Scholar

Björkman, M. and Holmström, K. (2000), ‘Global optimization of costly nonconvex functions using radial basis functions’, Optim. Engng 1, 373–397.Google Scholar

Blanchet, J., Cartis, C., Menickelly, M. and Scheinberg, K. (2019), ‘Convergence rate analysis of a stochastic trust region method via submartingales’, INFORMS J. Optim., to appear. arXiv:1609.07428 Google Scholar

Blum, J. R. (1954a), ‘Approximation methods which converge with probability one’, Ann. Math. Statist. 25, 382–386.Google Scholar

Blum, J. R. (1954b), ‘Multidimensional stochastic approximation methods’, Ann. Math. Statist. 25, 737–744.Google Scholar

Boender, C. G. E., Rinnooy Kan, A. H. G., Timmer, G. T. and Stougie, L. (1982), ‘A stochastic method for global optimization’, Math. Program. 22, 125–140.Google Scholar

Bogani, C., Gasparo, M. G. and Papini, A. (2009), ‘Generalized pattern search methods for a class of nonsmooth optimization problems with structure’, J. Comput. Appl. Math. 229, 283–293.Google Scholar

Bortz, D. M. and Kelley, C. T. (1998), The simplex gradient and noisy optimization problems. In Computational Methods for Optimal Design and Control (Borggaard, J. et al. , eds), Vol. 24 of Progress in Systems and Control Theory, Birkhäuser, pp. 77–90.Google Scholar

Bottou, L., Curtis, F. E. and Nocedal, J. (2018), ‘Optimization methods for large-scale machine learning’, SIAM Review 60, 223–311.Google Scholar

Box, G. E. P. and Draper, N. R. (1987), Empirical Model Building and Response Surfaces, Wiley.Google Scholar

Box, M. J. (1965), ‘A new method of constrained optimization and a comparison with other methods’, Comput. J. 8, 42–52.Google Scholar

Box, M. J. (1966), ‘A comparison of several current optimization methods, and the use of transformations in constrained problems’, Comput. J. 9, 67–77.Google Scholar

Brekelmans, R., Driessen, L., Hamers, H. and den Hertog, D. (2005), ‘Constrained optimization involving expensive function evaluations: A sequential approach’, European J. Oper. Res. 160, 121–138.Google Scholar

Brent, R. P. (1973), Algorithms for Minimization Without Derivatives, Prentice Hall.Google Scholar

Brown, K. M. and Dennis, J. E. Jr (1971), ‘Derivative free analogues of the Levenberg–Marquardt and Gauss algorithms for nonlinear least squares approximation’, Numer. Math. 18, 289–297.Google Scholar

Bubeck, S. and Cesa-Bianchi, N. (2012), ‘Regret analysis of stochastic and nonstochastic multi-armed bandit problems’, Found. Trends Mach. Learn. 5, 1–122.Google Scholar

Bubeck, S., Lee, Y. T. and Eldan, R. (2017), Kernel-based methods for bandit convex optimization. In 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC 2017), ACM, pp. 72–85.Google Scholar

Bubeck, S., Munos, R., Stoltz, G. and Szepesvári, C. (2011a), ‘

${\mathcal{X}}$ -armed bandits’, J. Mach. Learn. Res. 12, 1655–1695.Google Scholar

Bubeck, S., Stoltz, G. and Yu, J. Y. (2011b), Lipschitz bandits without the Lipschitz constant. In Algorithmic Learning Theory (ALT 2011) (Kivinen, J. et al. , eds), Vol. 6925 of Lecture Notes in Computer Science, Springer, pp. 144–158.Google Scholar

Bueno, L. F., Friedlander, A., Martínez, J. M. and Sobral, F. N. C. (2013), ‘Inexact restoration method for derivative-free optimization with smooth constraints’, SIAM J. Optim. 23, 1189–1213.Google Scholar

Buhmann, M. D. (2000), Radial basis functions. In Acta Numerica, Vol. 9, Cambridge University Press, pp. 1–38.Google Scholar

Bull, A. D. (2011), ‘Convergence rates of efficient global optimization algorithms’, J. Mach. Learn. Res. 12, 2879–2904.Google Scholar

Burke, J. V., Curtis, F. E., Lewis, A. S., Overton, M. L. and Simões, L. E. A. (2018), Gradient sampling methods for nonsmooth optimization. arXiv:1804.11003 Google Scholar

Bűrmen, Á., Olenšek, J. and Tuma, T. (2015), ‘Mesh adaptive direct search with second directional derivative-based Hessian update’, Comput. Optim. Appl. 62, 693–715.Google Scholar

Bűrmen, Á., Puhan, J. and Tuma, T. (2006), ‘Grid restrained Nelder–Mead algorithm’, Comput. Optim. Appl. 34, 359–375.Google Scholar

Carter, R. G., Gablonsky, J. M., Patrick, A., Kelley, C. T. and Eslinger, O. J. (2001), ‘Algorithms for noisy problems in gas transmission pipeline optimization’, Optim. Engng 2, 139–157.Google Scholar

Cartis, C. and Roberts, L. (2017), A derivative-free Gauss–Newton method. arXiv:1710.11005 Google Scholar

Cartis, C. and Scheinberg, K. (2018), ‘Global convergence rate analysis of unconstrained optimization methods based on probabilistic models’, Math. Program. 169, 337–375.Google Scholar

Cartis, C., Fiala, J., Marteau, B. and Roberts, L. (2018), Improving the flexibility and robustness of model-based derivative-free optimization solvers. arXiv:1804.00154 Google Scholar

Cartis, C., Gould, N. I. M. and Toint, P. L. (2012), ‘On the oracle complexity of first-order and derivative-free algorithms for smooth nonconvex minimization’, SIAM J. Optim. 22, 66–86.Google Scholar

Cartis, C., Gould, N. I. M. and Toint, P. L. (2018), Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints. arXiv:1811.01220 Google Scholar

Castle, B. (2012), Quasi-Newton methods for stochastic optimization and proximity-based methods for disparate information fusion. PhD thesis, Indiana University.Google Scholar

Céa, J. (1971), Optimisation: Théorie et Algorithmes, Méthodes Mathématiques de l’Informatique, Dunod.Google Scholar

Chandramouli, G. and Narayanan, V. (2019), A scaled conjugate gradient based direct search algorithm for high dimensional box constrained derivative free optimization. arXiv:1901.05215 Google Scholar

Chang, K.-H. (2012), ‘Stochastic Nelder–Mead simplex method: A new globally convergent direct search method for simulation optimization’, European J. Oper. Res. 220, 684–694.Google Scholar

Chang, K.-H., Hong, L. J. and Wan, H. (2013), ‘Stochastic trust-region response-surface method (STRONG): A new response-surface framework for simulation optimization’, INFORMS J. Comput. 25, 230–243.Google Scholar

Chen, H. and Schmeiser, B. W. (2001), ‘Stochastic root finding via retrospective approximation’, IIE Trans. 33, 259–275.Google Scholar

Chen, R. (2015), Stochastic derivative-free optimization of noisy functions. PhD thesis, Lehigh University.Google Scholar

Chen, R., Menickelly, M. and Scheinberg, K. (2018a), ‘Stochastic optimization using a trust-region method and random models’, Math. Program. 169, 447–487.Google Scholar

Chen, X. and Kelley, C. T. (2016), ‘Optimization with hidden constraints and embedded Monte Carlo computations’, Optim. Engng 17, 157–175.Google Scholar

Chen, X., Kelley, C. T., Xu, F. and Zhang, Z. (2018b), ‘A smoothing direct search method for Monte Carlo-based bound constrained composite nonsmooth optimization’, SIAM J. Sci. Comput. 40, A2174–A2199.Google Scholar

Choi, T. D. and Kelley, C. T. (2000), ‘Superlinear convergence and implicit filtering’, SIAM J. Optim. 10, 1149–1162.Google Scholar

Choi, T. D., Eslinger, O. J., Kelley, C. T., David, J. W. and Etheridge, M. (2000), ‘Optimization of automotive valve train components with implicit filtering’, Optim. Engng 1, 9–27.Google Scholar

Ciccazzo, A., Latorre, V., Liuzzi, G., Lucidi, S. and Rinaldi, F. (2015), ‘Derivative-free robust optimization for circuit design’, J. Optim. Theory Appl. 164, 842–861.Google Scholar

Cocchi, G., Liuzzi, G., Papini, A. and Sciandrone, M. (2018), ‘An implicit filtering algorithm for derivative-free multiobjective optimization with box constraints’, Comput. Optim. Appl. 69, 267–296.Google Scholar

Colson, B. and Toint, P. L. (2001), ‘Exploiting band structure in unconstrained optimization without derivatives’, Optim. Engng 2, 399–412.Google Scholar

Colson, B. and Toint, P. L. (2002), A derivative-free algorithm for sparse unconstrained optimization problems. In Trends in Industrial and Applied Mathematics (Siddiqi, A. H. and Kočvara, M., eds), Vol. 72 of Applied Optimization, Springer, pp. 131–147.Google Scholar

Colson, B. and Toint, P. L. (2005), ‘Optimizing partially separable functions without derivatives’, Optim. Methods Software 20, 493–508.Google Scholar

Conejo, P. D., Karas, E. W. and Pedroso, L. G. (2015), ‘A trust-region derivative-free algorithm for constrained optimization’, Optim. Methods Software 30, 1126–1145.Google Scholar

Conejo, P. D., Karas, E. W., Pedroso, L. G., Ribeiro, A. A. and Sachine, M. (2013), ‘Global convergence of trust-region algorithms for convex constrained minimization without derivatives’, Appl. Math. Comput. 220, 324–330.Google Scholar

Conn, A. R. and Le Digabel, S. (2013), ‘Use of quadratic models with mesh-adaptive direct search for constrained black box optimization’, Optim. Methods Software 28, 139–158.Google Scholar

Conn, A. R. and Toint, P. L. (1996), An algorithm using quadratic interpolation for unconstrained derivative free optimization. In Nonlinear Optimization and Applications (Di Pillo, G. and Giannessi, F., eds), Springer, pp. 27–47.Google Scholar

Conn, A. R. and Vicente, L. N. (2012), ‘Bilevel derivative-free optimization and its application to robust optimization’, Optim. Methods Software 27, 561–577.Google Scholar

Conn, A. R., Gould, N. I. M. and Toint, P. L. (1991), ‘A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds’, SIAM J. Numer. Anal. 28, 545–572.Google Scholar

Conn, A. R., Gould, N. I. M. and Toint, P. L. (2000), Trust-Region Methods, SIAM.Google Scholar

Conn, A. R., Scheinberg, K. and Toint, P. L. (1997a), On the convergence of derivative-free methods for unconstrained optimization. In Approximation Theory and Optimization: Tributes to M. J. D. Powell (Iserles, A. and Buhmann, M., eds), Cambridge University Press, pp. 83–108.Google Scholar

Conn, A. R., Scheinberg, K. and Toint, P. L. (1997b), ‘Recent progress in unconstrained nonlinear optimization without derivatives’, Math. Program. 79, 397–414.Google Scholar

Conn, A. R., Scheinberg, K. and Toint, P. L. (1998), A derivative free optimization algorithm in practice. In 7th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, American Institute of Aeronautics and Astronautics.Google Scholar

Conn, A. R., Scheinberg, K. and Toint, P. L. (2001), DFO (derivative free optimization software). https://projects.coin-or.org/Dfo Google Scholar

Conn, A. R., Scheinberg, K. and Vicente, L. N. (2008a), ‘Geometry of interpolation sets in derivative free optimization’, Math. Program. 111, 141–172.Google Scholar

Conn, A. R., Scheinberg, K. and Vicente, L. N. (2008b), ‘Geometry of sample sets in derivative free optimization: Polynomial regression and underdetermined interpolation’, IMA J. Numer. Anal. 28, 721–748.Google Scholar

Conn, A. R., Scheinberg, K. and Vicente, L. N. (2009a), ‘Global convergence of general derivative-free trust-region algorithms to first and second order critical points’, SIAM J. Optim. 20, 387–415.Google Scholar

Conn, A. R., Scheinberg, K. and Vicente, L. N. (2009b), Introduction to Derivative-Free Optimization, SIAM.Google Scholar

Coope, I. D. and Price, C. J. (2000), ‘Frame based methods for unconstrained optimization’, J. Optim. Theory Appl. 107, 261–274.Google Scholar

Coope, I. D. and Tappenden, R. (2019), ‘Efficient calculation of regular simplex gradients’, Comput. Optim. Appl. 72, 561–588.Google Scholar

Custódio, A. L. and Madeira, J. F. A. (2015), ‘GLODS: Global and local optimization using direct search’, J. Global Optim. 62, 1–28.Google Scholar

Custódio, A. L. and Madeira, J. F. A. (2016), MultiGLODS: Global and local multiobjective optimization using direct search. Technical report, Universidade Nova de Lisboa.Google Scholar

Custódio, A. L. and Vicente, L. N. (2005), SID-PSM: A pattern search method guided by simplex derivatives for use in derivative-free optimization. http://www.mat.uc.pt/sid-psm Google Scholar

Custódio, A. L. and Vicente, L. N. (2007), ‘Using sampling and simplex derivatives in pattern search methods’, SIAM J. Optim. 18, 537–555.Google Scholar

Custódio, A. L., Dennis, J. E. Jr and Vicente, L. N. (2008), ‘Using simplex gradients of nonsmooth functions in direct search methods’, IMA J. Numer. Anal. 28, 770–784.Google Scholar

Custódio, A. L., Madeira, J. F. A., Vaz, A. I. F. and Vicente, L. N. (2011), ‘Direct multisearch for multiobjective optimization’, SIAM J. Optim. 21, 1109–1140.Google Scholar

Custódio, A. L., Rocha, H. and Vicente, L. N. (2009), ‘Incorporating minimum Frobenius norm models in direct search’, Comput. Optim. Appl. 46, 265–278.Google Scholar

Dai, L. (2016a), Convergence rates of finite difference stochastic approximation algorithms, I: General sampling. In Sensing and Analysis Technologies for Biomedical and Cognitive Applications (Dai, L. et al. , eds), SPIE.Google Scholar

Dai, L. (2016b), Convergence rates of finite difference stochastic approximation algorithms, II: Implementation via common random numbers. In Sensing and Analysis Technologies for Biomedical and Cognitive Applications (Dai, L. et al. , eds), SPIE.Google Scholar

D’Ambrosio, C., Nannicini, G. and Sartor, G. (2017), ‘MILP models for the selection of a small set of well-distributed points’, Oper. Res. Lett. 45, 46–52.Google Scholar

Davis, C. (1954), ‘Theory of positive linear dependence’, Amer. J. Math. 76, 733–746.Google Scholar

Davis, P. (2005), Michael J. D. Powell: An oral history. In History of Numerical Analysis and Scientific Computing Project, SIAM.Google Scholar

Davis, C. and Hare, W. L. (2013), ‘Exploiting known structures to approximate normal cones’, Math. Oper. Res. 38, 665–681.Google Scholar

De Leone, R., Gaudioso, M. and Grippo, L. (1984), ‘Stopping criteria for linesearch methods without derivatives’, Math. Program. 30, 285–300.Google Scholar

Dekel, O., Eldan, R. and Koren, T. (2015), Bandit smooth convex optimization: Improving the bias-variance tradeoff. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al. , eds), Curran Associates, pp. 2926–2934.Google Scholar

Dener, A., Denchfield, A., Munson, T., Sarich, J., Wild, S. M., Benson, S. and Curfman McInnes, L. (2018), TAO 3.10 users manual. Technical Memorandum ANL/MCS-TM-322, Argonne National Laboratory.Google Scholar

Deng, G. and Ferris, M. C. (2009), ‘Variable-number sample-path optimization’, Math. Program. 117, 81–109.Google Scholar

Deng, G. and Ferris, M. C. (2006), Adaptation of the UOBYQA algorithm for noisy functions. In Proceedings of the 2006 Winter Simulation Conference, IEEE, pp. 312–319.Google Scholar

Dennis, J. E. Jr and Torczon, V. (1991), ‘Direct search methods on parallel machines’, SIAM J. Optim. 1, 448–474.Google Scholar

Dennis, J. E. Jr and Torczon, V. (1997), Managing approximation models in optimization. In Multidisciplinary Design Optimization: State-of-the-Art (Alexandrov, N. M. and Hussaini, M. Y., eds), SIAM, pp. 330–347.Google Scholar

Dennis, J. E. Jr and Woods, D. J. (1987), Optimization on microcomputers: The Nelder–Mead simplex algorithm. In New Computing Environments: Microcomputers in Large-Scale Computing (Wouk, A., ed.), SIAM, pp. 116–122.Google Scholar

Derman, C. (1956), ‘An application of Chung’s lemma to the Kiefer–Wolfowitz stochastic approximation procedure’, Ann. Math. Statist. 27, 532–536.Google Scholar

Diaz, G. I., Fokoue-Nkoutche, A., Nannicini, G. and Samulowitz, H. (2017), ‘An effective algorithm for hyperparameter optimization of neural networks’, IBM J. Res. Develop. 61, 9:1–9:11.Google Scholar

Diniz-Ehrhardt, M. A., Martínez, J. M. and Pedroso, L. G. (2011), ‘Derivative-free methods for nonlinear programming with general lower-level constraints’, Comput. Appl. Math. 30, 19–52.Google Scholar

Diniz-Ehrhardt, M. A., Martínez, J. M. and Raydan, M. (2008), ‘A derivative-free nonmonotone line-search technique for unconstrained optimization’, J. Comput. Appl. Math. 219, 383–397.Google Scholar

Dodangeh, M. and Vicente, L. N. (2016), ‘Worst case complexity of direct search under convexity’, Math. Program. 155, 307–332.Google Scholar

Dodangeh, M., Vicente, L. N. and Zhang, Z. (2016), ‘On the optimal order of worst case complexity of direct search’, Optim. Lett. 10, 699–708.Google Scholar

Dolan, E. D., Lewis, R. M. and Torczon, V. (2003), ‘On the local convergence of pattern search’, SIAM J. Optim. 14, 567–583.Google Scholar

Duchi, J. C., Jordan, M. I., Wainwright, M. J. and Wibisono, A. (2015), ‘Optimal rates for zero-order convex optimization: The power of two function evaluations’, IEEE Trans. Inform. Theory 61, 2788–2806.Google Scholar

Durrett, R. (2010), Probability: Theory and Examples, Cambridge University Press.Google Scholar

Dvurechensky, P., Gasnikov, A. and Gorbunov, E. (2018), An accelerated method for derivative-free smooth stochastic convex optimization. arXiv:1802.09022 Google Scholar

Echebest, N., Schuverdt, M. L. and Vignau, R. P. (2015), ‘An inexact restoration derivative-free filter method for nonlinear programming’, Comput. Appl. Math. 36, 693–718.Google Scholar

Ehrgott, M. (2005), Multicriteria Optimization, second edition, Springer.Google Scholar

Elster, C. and Neumaier, A. (1995), ‘A grid algorithm for bound constrained optimization of noisy functions’, IMA J. Numer. Anal. 15, 585–608.Google Scholar

Fabian, V. (1971), Stochastic approximation. In Optimizing Methods in Statistics, Elsevier, pp. 439–470.Google Scholar

Fasano, G., Liuzzi, G., Lucidi, S. and Rinaldi, F. (2014), ‘A linesearch-based derivative-free approach for nonsmooth constrained optimization’, SIAM J. Optim. 24, 959–992.Google Scholar

Fasano, G., Morales, J. L. and Nocedal, J. (2009), ‘On the geometry phase in model-based algorithms for derivative-free optimization’, Optim. Methods Software 24, 145–154.Google Scholar

Fermi, E. and Metropolis, N. (1952), Numerical solution of a minimum problem. Technical report LA-1492, Los Alamos Scientific Laboratory of the University of California.Google Scholar

Ferreira, P. S., Karas, E. W., Sachine, M. and Sobral, F. N. C. (2017), ‘Global convergence of a derivative-free inexact restoration filter algorithm for nonlinear programming’, Optimization 66, 271–292.Google Scholar

Finkel, D. E. and Kelley, C. T. (2004), Convergence analysis of the DIRECT algorithm. Technical report 04-28, Center for Research in Scientific Computation, North Carolina State University. https://projects.ncsu.edu/crsc/reports/ftp/pdf/crsc-tr04-28.pdf Google Scholar

Finkel, D. E. and Kelley, C. T. (2006), ‘Additive scaling and the DIRECT algorithm’, J. Global Optim. 36, 597–608.Google Scholar

Finkel, D. E. and Kelley, C. T. (2009), ‘Convergence analysis of sampling methods for perturbed Lipschitz functions’, Pacific J. Optim. 5, 339–350.Google Scholar

Flaxman, A., Kalai, A. and McMahan, B. (2005), Online convex optimization in the bandit setting: Gradient descent without a gradient. In 16th Annual ACM–SIAM Symposium on Discrete Algorithms (SODA ’05), ACM, pp. 385–394.Google Scholar

Fletcher, R. (1965), ‘Function minimization without evaluating derivatives: A review’, Comput. J. 8, 33–41.Google Scholar

Fletcher, R. (1987), Practical Methods of Optimization, second edition, Wiley.Google Scholar

Forrester, A., Sobester, A. and Keane, A. (2008), Engineering Design via Surrogate Modelling: A Practical Guide, Wiley.Google Scholar

Forth, S., Hovland, P., Phipps, E., Utke, J. & Walther, A., eds (2012), Recent Advances in Algorithmic Differentiation, Springer.Google Scholar

Frandi, E. and Papini, A. (2013), ‘Coordinate search algorithms in multilevel optimization’, Optim. Methods Software 29, 1020–1041.Google Scholar

Frazier, P. I. (2018), A tutorial on Bayesian optimization. arXiv:1807.02811 Google Scholar

Frimannslund, L. and Steihaug, T. (2007), ‘A generating set search method using curvature information’, Comput. Optim. Appl. 38, 105–121.Google Scholar

Frimannslund, L. and Steihaug, T. (2010), A new generating set search algorithm for partially separable functions. In 4th International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP 2010), IARIA, pp. 65–70.Google Scholar

Frimannslund, L. and Steihaug, T. (2011), ‘On a new method for derivative free optimization’, Internat. J. Adv. Software 4, 244–255.Google Scholar

Fu, M. C., Glover, F. W. and April, J. (2005), Simulation optimization: A review, new developments, and applications. In Proceedings of the 2005 Winter Simulation Conference, IEEE.Google Scholar

Gao, F. and Han, L. (2012), ‘Implementing the Nelder–Mead simplex algorithm with adaptive parameters’, Comput. Optim. Appl. 51, 259–277.Google Scholar

García-Palomares, U. M. and Rodríguez, J. F. (2002), ‘New sequential and parallel derivative-free algorithms for unconstrained minimization’, SIAM J. Optim. 13, 79–96.Google Scholar

García-Palomares, U. M., García-Urrea, I. J. and Rodríguez-Hernández, P. S. (2013), ‘On sequential and parallel non-monotone derivative-free algorithms for box constrained optimization’, Optim. Methods Software 28, 1233–1261.Google Scholar

Garmanjani, R. and Vicente, L. N. (2012), ‘Smoothing and worst-case complexity for direct-search methods in nonsmooth optimization’, IMA J. Numer. Anal. 33, 1008–1028.Google Scholar

Garmanjani, R., Jùdice, D. and Vicente, L. N. (2016), ‘Trust-region methods without using derivatives: Worst case complexity and the nonsmooth case’, SIAM J. Optim. 26, 1987–2011.Google Scholar

Gasnikov, A. V., Krymova, E. A., Lagunovskaya, A. A., Usmanova, I. N. and Fedorenko, F. A. (2017), ‘Stochastic online optimization: Single-point and multi-point non-linear multi-armed bandits; Convex and strongly-convex case’, Automat. Remote Control 78, 224–234.Google Scholar

Gerencsér, L. (1997), Rate of convergence of moments of Spall’s SPSA method. In Stochastic Differential and Difference Equations, Vol. 23 of Progress in Systems and Control Theory, Birkhäuser, pp. 67–75.Google Scholar

Ghadimi, S. (2019), ‘Conditional gradient type methods for composite nonlinear and stochastic optimization’, Math. Program. 173, 431–464.Google Scholar

Ghadimi, S. and Lan, G. (2013), ‘Stochastic first- and zeroth-order methods for nonconvex stochastic programming’, SIAM J. Optim. 23, 2341–2368.Google Scholar

Gill, P. E., Murray, W. and Wright, M. H. (1981), Practical Optimization, Academic Press.Google Scholar

Gill, P. E., Murray, W., Saunders, M. A. and Wright, M. H. (1983), ‘Computing forward-difference intervals for numerical optimization’, SIAM J. Sci. Statist. Comput. 4, 310–321.Google Scholar

Gilmore, P. and Kelley, C. T. (1995), ‘An implicit filtering algorithm for optimization of functions with many local minima’, SIAM J. Optim. 5, 269–285.Google Scholar

Glass, H. and Cooper, L. (1965), ‘Sequential search: A method for solving constrained optimization problems’, J. Assoc. Comput. Mach. 12, 71–82.Google Scholar

Golovin, D., Solnik, B., Moitra, S., Kochanski, G., Karro, J. and Sculley, D. (2017), Google Vizier: A service for black-box optimization. In 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17), ACM, pp. 1487–1495.Google Scholar

Gonzaga, C. C., Karas, E. and Vanti, M. (2004), ‘A globally convergent filter method for nonlinear programming’, SIAM J. Optim. 14, 646–669.Google Scholar

Gould, N. I. M. and Toint, P. L. (2010), ‘Nonlinear programming without a penalty function or a filter’, Math. Program. 122, 155–196.Google Scholar

Gramacy, R. B. and Le Digabel, S. (2015), ‘The mesh adaptive direct search algorithm with treed Gaussian process surrogates’, Pacific J. Optim. 11, 419–447.Google Scholar

Grapiglia, G. N., Yuan, J. and Yuan, Y.-x. (2016), ‘A derivative-free trust-region algorithm for composite nonsmooth optimization’, Comput. Appl. Math. 35, 475–499.Google Scholar

Gratton, S. and Vicente, L. N. (2014), ‘A merit function approach for direct search’, SIAM J. Optim. 24, 1980–1998.Google Scholar

Gratton, S., Royer, C. W. and Vicente, L. N. (2016), ‘A second-order globally convergent direct-search method and its worst-case complexity’, Optimization 65, 1105–1128.Google Scholar

Gratton, S., Royer, C. W. and Vicente, L. N. (2019a), ‘A decoupled first/second-order steps technique for nonconvex nonlinear unconstrained optimization with improved complexity bounds’, Math. Program. doi:10.1007/s10107-018-1328-7 Google Scholar

Gratton, S., Royer, C. W., Vicente, L. N. and Zhang, Z. (2015), ‘Direct search based on probabilistic descent’, SIAM J. Optim. 25, 1515–1541.Google Scholar

Gratton, S., Royer, C. W., Vicente, L. N. and Zhang, Z. (2018), ‘Complexity and global rates of trust-region methods based on probabilistic models’, IMA J. Numer. Anal. 38, 1579–1597.Google Scholar

Gratton, S., Royer, C. W., Vicente, L. N. and Zhang, Z. (2019b), ‘Direct search based on probabilistic feasible descent for bound and linearly constrained problems’, Comput. Optim. Appl. 72, 525–559.Google Scholar

Gratton, S., Toint, P. L. and Tröltzsch, A. (2011), ‘An active-set trust-region method for derivative-free nonlinear bound-constrained optimization’, Optim. Methods Software 26, 873–894.Google Scholar

Gray, G. A. and Kolda, T. G. (2006), ‘Algorithm 856: APPSPACK 4.0: Asynchronous parallel pattern search for derivative-free optimization’, ACM Trans. Math. Software 32, 485–507.Google Scholar

Griewank, A. (2003), A mathematical view of automatic differentiation. In Acta Numerica, Vol. 12, Cambridge University Press, pp. 321–398.Google Scholar

Griewank, A. and Walther, A. (2008), Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, SIAM.Google Scholar

Griewank, A., Walther, A., Fiege, S. and Bosse, T. (2016), ‘On Lipschitz optimization based on gray-box piecewise linearization’, Math. Program. 158, 383–415.Google Scholar

Grippo, L. and Rinaldi, F. (2014), ‘A class of derivative-free nonmonotone optimization algorithms employing coordinate rotations and gradient approximations’, Comput. Optim. Appl. 60, 1–33.Google Scholar

Grippo, L. and Sciandrone, M. (2007), ‘Nonmonotone derivative-free methods for nonlinear equations’, Comput. Optim. Appl. 37, 297–328.Google Scholar

Grippo, L., Lampariello, F. and Lucidi, S. (1988), ‘Global convergence and stabilization of unconstrained minimization methods without derivatives’, J. Optim. Theory Appl. 56, 385–406.Google Scholar

Gu, B., Huo, Z. and Huang, H. (2016), Zeroth-order asynchronous doubly stochastic algorithm with variance reduction. arXiv:1612.01425 Google Scholar

Gumma, E. A. E., Hashim, M. H. A. and Ali, M. M. (2014), ‘A derivative-free algorithm for linearly constrained optimization problems’, Comput. Optim. Appl. 57, 599–621.Google Scholar

Gunzburger, M. D., Webster, C. G. and Zhang, G. (2014), Stochastic finite element methods for partial differential equations with random input data. In Acta Numerica, Vol. 23, Cambridge University Press, pp. 521–650.Google Scholar

Gutmann, H.-M. (2001), ‘A radial basis function method for global optimization’, J. Global Optim. 19, 201–227.Google Scholar

Han, L. and Liu, G. (2004), ‘On the convergence of the UOBYQA method’, J. Appl. Math. Comput. 16, 125–142.Google Scholar

Hansen, P., Jaumard, B. and Lu, S.-H. (1991), ‘On the number of iterations of Piyavskii’s global optimization algorithm’, Math. Oper. Res. 16, 334–350.Google Scholar

Hare, W. L. (2014), ‘Numerical analysis of

${\mathcal{V}}{\mathcal{U}}$ -decomposition,

${\mathcal{U}}$ -gradient, and

${\mathcal{U}}$ -Hessian approximations’, SIAM J. Optim. 24, 1890–1913.Google Scholar

Hare, W. L. (2017), ‘Compositions of convex functions and fully linear models’, Optim. Lett. 11, 1217–1227.Google Scholar

Hare, W. L. and Lewis, A. S. (2005), ‘Estimating tangent and normal cones without calculus’, Math. Oper. Res. 30, 785–799.Google Scholar

Hare, W. L. and Lucet, Y. (2013), ‘Derivative-free optimization via proximal point methods’, J. Optim. Theory Appl. 160, 204–220.Google Scholar

Hare, W. L. and Macklem, M. (2013), ‘Derivative-free optimization methods for finite minimax problems’, Optim. Methods Software 28, 300–312.Google Scholar

Hare, W. L. and Nutini, J. (2013), ‘A derivative-free approximate gradient sampling algorithm for finite minimax problems’, Comput. Optim. Appl. 56, 1–38.Google Scholar

Hare, W. L., Planiden, C. and Sagastizábal, C. (2019), A derivative-free

${\mathcal{V}}{\mathcal{U}}$ -algorithm for convex finite-max problems. arXiv:1903.11184 Google Scholar

Hazan, E., Koren, T. and Levy, K. Y. (2014), Logistic regression: Tight bounds for stochastic and online optimization. In 27th Conference on Learning Theory (COLT 2014) (Balcan, M. F. et al. , eds). Proc. Mach. Learn. Res. 35, 197–209.Google Scholar

He, J., Verstak, A., Sosonkina, M. and Watson, L. T. (2009a), ‘Performance modeling and analysis of a massively parallel DIRECT, part 2’, Internat. J. High Performance Comput. Appl. 23, 29–41.Google Scholar

He, J., Verstak, A., Watson, L. T. and Sosonkina, M. (2007), ‘Design and implementation of a massively parallel version of DIRECT’, Comput. Optim. Appl. 40, 217–245.Google Scholar

He, J., Verstak, A., Watson, L. T. and Sosonkina, M. (2009b), ‘Performance modeling and analysis of a massively parallel DIRECT, part 1’, Internat. J. High Performance Comput. Appl. 23, 14–28.Google Scholar

He, J., Watson, L. T. and Sosonkina, M. (2009c), ‘Algorithm 897: VTDIRECT95: Serial and parallel codes for the global optimization algorithm DIRECT’, ACM Trans. Math. Software 36, 1–24.Google Scholar

Homem-de-Mello, T. and Bayraksan, G. (2014), ‘Monte Carlo sampling-based methods for stochastic optimization’, Surv. Oper. Res. Manag. Sci. 19, 56–85.Google Scholar

Hooke, R. and Jeeves, T. A. (1961), ‘Direct search” solution of numerical and statistical problems’, J. Assoc. Comput. Mach. 8, 212–229.Google Scholar

Hough, P. D. and Meza, J. C. (2002), ‘A class of trust-region methods for parallel optimization’, SIAM J. Optim. 13, 264–282.Google Scholar

Hough, P. D., Kolda, T. G. and Torczon, V. J. (2001), ‘Asynchronous parallel pattern search for nonlinear optimization’, SIAM J. Sci. Comput. 23, 134–156.Google Scholar

Hu, X., Prashanth, L. A., György, A. and Szepesvári, C. (2016), (Bandit) convex optimization with biased noisy gradient oracles. In 19th International Conference on Artificial Intelligence and Statistics. Proc. Mach. Learn. Res. 51, 819–828.Google Scholar

Hutchison, D. W. and Spall, J. C. (2013), Stochastic approximation. In Encyclopedia of Operations Research and Management Science, Springer, pp. 1470–1476.Google Scholar

Huyer, W. and Neumaier, A. (1999), ‘Global optimization by multilevel coordinate search’, J. Global Optim. 14, 331–355.Google Scholar

Huyer, W. and Neumaier, A. (2008), ‘SNOBFIT: Stable noisy optimization by branch and fit’, ACM Trans. Math. Software 35, 9:1–9:25.Google Scholar

Ilievski, I., Akhtar, T., Feng, J. and Shoemaker, C. (2017), Efficient hyperparameter optimization for deep learning algorithms using deterministic RBF surrogates. In 31st AAAI Conference on Artificial Intelligence (AAAI ’17), pp. 822–829.Google Scholar

Jamieson, K. G., Nowak, R. D. and Recht, B. (2012), Query complexity of derivative-free optimization. In Advances in Neural Information Processing Systems 25 (Pereira, F. et al. , eds), Curran Associates, pp. 2672–2680.Google Scholar

Jones, D. R., Perttunen, C. D. and Stuckman, B. E. (1993), ‘Lipschitzian optimization without the Lipschitz constant’, J. Optim. Theory Appl. 79, 157–181.Google Scholar

Jones, D. R., Schonlau, M. and Welch, W. J. (1998), ‘Efficient global optimization of expensive black-box functions’, J. Global Optim. 13, 455–492.Google Scholar

Karmanov, V. G. (1974), ‘Convergence estimates for iterative minimization methods’, USSR Comput. Math. Math. Phys. 14, 1–13.Google Scholar

Kelley, C. T. (1999a), ‘Detection and remediation of stagnation in the Nelder–Mead algorithm using a sufficient decrease condition’, SIAM J. Optim. 10, 43–55.Google Scholar

Kelley, C. T. (1999b), Iterative Methods for Optimization, SIAM.Google Scholar

Kelley, C. T. (2003), Implicit filtering and nonlinear least squares problems. In System Modeling and Optimization XX (Sachs, E. W. and Tichatschke, R., eds), Vol. 130 of IFIP: The International Federation for Information Processing, Springer, pp. 71–90.Google Scholar

Kelley, C. T. (2011), Implicit Filtering, SIAM.Google Scholar

Khan, K. A., Larson, J. and Wild, S. M. (2018), ‘Manifold sampling for optimization of nonconvex functions that are piecewise linear compositions of smooth components’, SIAM J. Optim. 28, 3001–3024.Google Scholar

Kiefer, J. and Wolfowitz, J. (1952), ‘Stochastic estimation of the maximum of a regression function’, Ann. Math. Statist. 22, 462–466.Google Scholar

Kim, S. and Ryu, J.-h. (2011), ‘A trust-region algorithm for bi-objective stochastic optimization’, Procedia Comput. Sci. 4, 1422–1430.Google Scholar

Kim, S. and Zhang, D. (2010), Convergence properties of direct search methods for stochastic optimization. In Proceedings of the 2010 Winter Simulation Conference, IEEE.Google Scholar

Kim, S., Pasupathy, R. and Henderson, S. G. (2015), A guide to sample average approximation. In Handbook of Simulation Optimization (Fu, M., ed.), Vol. 216, International Series in Operations Research & Management Science, Springer, pp. 207–243.Google Scholar

Kiwiel, K. C. (2010), ‘A nonderivative version of the gradient sampling algorithm for nonsmooth nonconvex optimization’, SIAM J. Optim. 20, 1983–1994.Google Scholar

Kleinberg, R., Slivkins, A. and Upfal, E. (2008), Multi-armed bandits in metric spaces. In 40th Annual ACM Symposium on Theory of Computing (STOC 2008), ACM, pp. 681–690.Google Scholar

Kleinman, N. L., Spall, J. C. and Naiman, D. Q. (1999), ‘Simulation-based optimization with stochastic approximation using common random numbers’, Manag. Sci. 45, 1570–1578.Google Scholar

Knowles, J. and Corne, D. (2002), On metrics for comparing nondominated sets. In Proceedings of the 2002 Congress on Evolutionary Computation, Vol. 1, IEEE, pp. 711–716.Google Scholar

Kolda, T. G., Lewis, R. M. and Torczon, V. (2006), ‘Stationarity results for generating set search for linearly constrained optimization’, SIAM J. Optim. 17, 943–968.Google Scholar

Kolda, T. G., Lewis, R. M. and Torczon, V. J. (2003), ‘Optimization by direct search: New perspectives on some classical and modern methods’, SIAM Review 45, 385–482.Google Scholar

Konečný, J. and Richtárik, P. (2014), Simple complexity analysis of simplified direct search. arXiv:1410.0390 Google Scholar

Kushner, H. and Yin, G. (2003), Stochastic Approximation and Recursive Algorithms and Applications, Springer.Google Scholar

Kushner, H. J. and Huang, H. (1979), ‘Rates of convergence for stochastic approximation type algorithms’, SIAM J. Control Optim. 17, 607–617.Google Scholar

La Cruz, W. (2014), ‘A projected derivative-free algorithm for nonlinear equations with convex constraints’, Optim. Methods Software 29, 24–41.Google Scholar

La Cruz, W., Martínez, J. M. and Raydan, M. (2006), ‘Spectral residual method without gradient information for solving large-scale nonlinear systems of equations’, Math. Comput. 75(255), 1429–1449.Google Scholar

Lagarias, J. C., Poonen, B. and Wright, M. H. (2012), ‘Convergence of the restricted Nelder–Mead algorithm in two dimensions’, SIAM J. Optim. 22, 501–532.Google Scholar

Lagarias, J. C., Reeds, J. A., Wright, M. H. and Wright, P. E. (1998), ‘Convergence properties of the Nelder–Mead simplex algorithm in low dimensions’, SIAM J. Optim. 9, 112–147.Google Scholar

Lai, T. L. and Robbins, H. (1985), ‘Asymptotically efficient adaptive allocation rules’, Adv. Appl. Math. 6, 4–22.Google Scholar

Larson, J. and Billups, S. C. (2016), ‘Stochastic derivative-free optimization using a trust region framework’, Comput. Optim. Appl. 64, 619–645.Google Scholar

Larson, J. and Wild, S. M. (2016), ‘A batch, derivative-free algorithm for finding multiple local minima’, Optim. Engng 17, 205–228.Google Scholar

Larson, J. and Wild, S. M. (2018), ‘Asynchronously parallel optimization solver for finding multiple minima’, Math. Program. Comput. 10, 303–332.Google Scholar

Larson, J., Leyffer, S., Palkar, P. and Wild, S. M. (2019), A method for convex black-box integer global optimization. arXiv:1903.11366 Google Scholar

Larson, J., Menickelly, M. and Wild, S. M. (2016), ‘Manifold sampling for

$\ell _{1}$ nonconvex optimization’, SIAM J. Optim. 26, 2540–2563.Google Scholar

Latorre, V., Habal, H., Graeb, H. and Lucidi, S. (2019), ‘Derivative free methodologies for circuit worst case analysis’, Optim. Lett. doi:10.1007/s11590-018-1364-5 Google Scholar

Lazar, M. and Jarre, F. (2016), ‘Calibration by optimization without using derivatives’, Optim. Engng 17, 833–860.Google Scholar

Le Digabel, S. (2011), ‘Algorithm 909: NOMAD: Nonlinear optimization with the MADS algorithm’, ACM Trans. Math. Software 37, 44:1–44:15.Google Scholar

Le Digabel, S. and Wild, S. M. (2015), A taxonomy of constraints in black-box simulation-based optimization. arXiv:1505.07881 Google Scholar

Le Gratiet, L. and Cannamela, C. (2015), ‘Cokriging-based sequential design strategies using fast cross-validation techniques for multi-fidelity computer codes’, Technometrics 57, 418–427.Google Scholar

L’Ecuyer, P. and Yin, G. (1998), ‘Budget-dependent convergence rate of stochastic approximation’, SIAM J. Optim. 8, 217–247.Google Scholar

Lee, H., Gramacy, R. B., Linkletter, C. and Gray, G. A. (2011), ‘Optimization subject to hidden constraints via statistical emulation’, Pacific J. Optim. 7, 467–478.Google Scholar

Lemarechal, C. & Mifflin, R., eds (1978), Nonsmooth Optimization: Proceedings of an IIASA Workshop, Pergamon Press.Google Scholar

Levenberg, K. (1944), ‘A method for the solution of certain non-linear problems in least squares’, Quart. Appl. Math. 2, 164–168.Google Scholar

Lewis, R. M. and Torczon, V. (1999), ‘Pattern search algorithms for bound constrained minimization’, SIAM J. Optim. 9, 1082–1099.Google Scholar

Lewis, R. M. and Torczon, V. (2000), ‘Pattern search methods for linearly constrained minimization’, SIAM J. Optim. 10, 917–941.Google Scholar

Lewis, R. M. and Torczon, V. (2002), ‘A globally convergent augmented Lagrangian pattern search algorithm for optimization with general constraints and simple bounds’, SIAM J. Optim. 12, 1075–1089.Google Scholar

Lewis, R. M. and Torczon, V. (2010), A direct search approach to nonlinear programming problems using an augmented Lagrangian method with explicit treatment of the linear constraints. Technical report WM-CS-2010-01, Department of Computer Science, College of William and Mary.Google Scholar

Lewis, R. M., Torczon, V. and Trosset, M. W. (2000), ‘Direct search methods: Then and now’, J. Comput. Appl. Math. 124, 191–207.Google Scholar

Leyffer, S. (2015), ‘It’s to solve problems: An interview with Roger Fletcher’, Optima 99, 1–6.Google Scholar

Li, D.-H. and Fukushima, M. (2000), ‘A derivative-free line search and global convergence of Broyden-like method for nonlinear equations’, Optim. Methods Software 13, 181–201.Google Scholar

Li, Q. and Li, D.-H. (2011), ‘A class of derivative-free methods for large-scale nonlinear monotone equations’, IMA J. Numer. Anal. 31, 1625–1635.Google Scholar

Liu, Q., Zeng, J. and Yang, G. (2015), ‘MrDIRECT: A multilevel robust DIRECT algorithm for global optimization problems’, J. Global Optim. 62, 205–227.Google Scholar

Liu, S., Kailkhura, B., Chen, P.-Y., Ting, P., Chang, S. and Amini, L. (2018), Zeroth-order stochastic variance reduction for nonconvex optimization. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al. , eds), Curran Associates, pp. 3731–3741.Google Scholar

Liuzzi, G. and Lucidi, S. (2009), ‘A derivative-free algorithm for inequality constrained nonlinear programming via smoothing of an

$\ell _{\infty }$ penalty function’, SIAM J. Optim. 20, 1–29.Google Scholar

Liuzzi, G., Lucidi, S. and Rinaldi, F. (2011), ‘Derivative-free methods for bound constrained mixed-integer optimization’, Comput. Optim. Appl. 53, 505–526.Google Scholar

Liuzzi, G., Lucidi, S. and Rinaldi, F. (2015), ‘Derivative-free methods for mixed-integer constrained optimization problems’, J. Optim. Theory Appl. 164, 933–965.Google Scholar

Liuzzi, G., Lucidi, S. and Rinaldi, F. (2016), ‘A derivative-free approach to constrained multiobjective nonsmooth optimization’, SIAM J. Optim. 26, 2744–2774.Google Scholar

Liuzzi, G., Lucidi, S. and Rinaldi, F. (2018), An algorithmic framework based on primitive directions and nonmonotone line searches for black box problems with integer variables. Report 6471, Optimization Online. http://www.optimization-online.org/DB_HTML/2018/02/6471.html Google Scholar

Liuzzi, G., Lucidi, S. and Sciandrone, M. (2006), ‘A derivative-free algorithm for linearly constrained finite minimax problems’, SIAM J. Optim. 16, 1054–1075.Google Scholar

Liuzzi, G., Lucidi, S. and Sciandrone, M. (2010), ‘Sequential penalty derivative-free methods for nonlinear constrained optimization’, SIAM J. Optim. 20, 2614–2635.Google Scholar

Lucidi, S. and Sciandrone, M. (2002a), ‘A derivative-free algorithm for bound constrained optimization’, Comput. Optim. Appl. 21, 119–142.Google Scholar

Lucidi, S. and Sciandrone, M. (2002b), ‘On the global convergence of derivative-free methods for unconstrained optimization’, SIAM J. Optim. 13, 97–116.Google Scholar

Lucidi, S., Sciandrone, M. and Tseng, P. (2002), ‘Objective-derivative-free methods for constrained optimization’, Math. Program. 92, 37–59.Google Scholar

Ma, J. and Zhang, X. (2009), ‘Pattern search methods for finite minimax problems’, J. Appl. Math. Comput. 32, 491–506.Google Scholar

Madsen, K. (1975), Minimax solution of non-linear equations without calculating derivatives. In Nondifferentiable Optimization (Balinski, M. L. and Wolfe, P., eds), Vol. 3 of Mathematical Programming Studies, Springer, pp. 110–126.Google Scholar

Maggiar, A., Wächter, A., Dolinskaya, I. S. and Staum, J. (2018), ‘A derivative-free trust-region algorithm for the optimization of functions smoothed via Gaussian convolution using adaptive multiple importance sampling’, SIAM J. Optim. 28, 1478–1507.Google Scholar

Marazzi, M. and Nocedal, J. (2002), ‘Wedge trust region methods for derivative free optimization’, Math. Program. 91, 289–305.Google Scholar

March, A. and Willcox, K. (2012), ‘Provably convergent multifidelity optimization algorithm not requiring high-fidelity derivatives’, AIAA J. 50, 1079–1089.Google Scholar

Marquardt, D. W. (1963), ‘An algorithm for least-squares estimation of nonlinear parameters’, J. Soc. Ind. Appl. Math. 11, 431–441.Google Scholar

Martínez, J. M. and Sobral, F. N. C. (2012), ‘Constrained derivative-free optimization on thin domains’, J. Global Optim. 56, 1217–1232.Google Scholar

Matyas, J. (1965), ‘Random optimization’, Automat. Remote Control 26, 246–253.Google Scholar

May, J. H. (1974), Linearly constrained nonlinear programming: A solution method that does not require analytic derivatives. PhD thesis, Yale University.Google Scholar

May, J. H. (1979), ‘Solving nonlinear programs without using analytic derivatives’, Oper. Res. 27, 457–484.Google Scholar

McKinney, R. L. (1962), ‘Positive bases for linear spaces’, Trans. Amer. Math. Soc. 103, 131–131.Google Scholar

McKinnon, K. I. M. (1998), ‘Convergence of the Nelder–Mead simplex method to a nonstationary point’, SIAM J. Optim. 9, 148–158.Google Scholar

Menickelly, M. (2017), Random models in nonlinear optimization. PhD thesis, Lehigh University.Google Scholar

Menickelly, M. and Wild, S. M. (2019), ‘Derivative-free robust optimization by outer approximations’, Math. Program. doi:10.1007/s10107-018-1326-9 Google Scholar

Mersha, A. G. and Dempe, S. (2011), ‘Direct search algorithm for bilevel programming problems’, Comput. Optim. Appl. 49, 1–15.Google Scholar

Mifflin, R. (1975), ‘A superlinearly convergent algorithm for minimization without evaluating derivatives’, Math. Program. 9, 100–117.Google Scholar

Mockus, J. (1989), Bayesian Approach to Global Optimization: Theory and Applications, Springer.Google Scholar

Moré, J. J. (1978), The Levenberg–Marquardt algorithm: Implementation and theory. In Numerical Analysis: Dundee 1977 (Watson, G. A., ed.), Vol. 630 of Lecture Notes in Mathematics, Springer, pp. 105–116.Google Scholar

Moré, J. J. and Sorensen, D. C. (1983), ‘Computing a trust region step’, SIAM J. Sci. Statist. Comput. 4, 553–572.Google Scholar

Moré, J. J. and Wild, S. M. (2009), ‘Benchmarking derivative-free optimization algorithms’, SIAM J. Optim. 20, 172–191.Google Scholar

Moré, J. J. and Wild, S. M. (2011), ‘Estimating computational noise’, SIAM J. Sci. Comput. 33, 1292–1314.Google Scholar

Moré, J. J. and Wild, S. M. (2012), ‘Estimating derivatives of noisy simulations’, ACM Trans. Math. Software 38, 19:1–19:21.Google Scholar

Moré, J. J. and Wild, S. M. (2014), ‘Do you trust derivatives or differences?’, J. Comput. Phys. 273, 268–277.Google Scholar

Morini, B., Porcelli, M. and Toint, P. L. (2018), ‘Approximate norm descent methods for constrained nonlinear systems’, Math. Comput. 87(311), 1327–1351.Google Scholar

Müller, J. (2016), ‘MISO: Mixed-integer surrogate optimization framework’, Optim. Engng 17, 177–203.Google Scholar

Müller, J. and Day, M. (2019), ‘Surrogate optimization of computationally expensive black-box problems with hidden constraints’, INFORMS J. Comput., to appear.Google Scholar

Müller, J. and Woodbury, J. D. (2017), ‘GOSAC: Global optimization with surrogate approximation of constraints’, J. Global Optim. 69, 117–136.Google Scholar

Müller, J., Shoemaker, C. A. and Piché, R. (2013a), ‘SO-I: A surrogate model algorithm for expensive nonlinear integer programming problems including global optimization applications’, J. Global Optim. 59, 865–889.Google Scholar

Müller, J., Shoemaker, C. A. and Piché, R. (2013b), ‘SO-MI: A surrogate model algorithm for computationally expensive nonlinear mixed-integer black-box global optimization problems’, Comput. Oper. Res. 40, 1383–1400.Google Scholar

Munos, R. (2011), Optimistic optimization of a deterministic function without the knowledge of its smoothness. In Advances in Neural Information Processing Systems 24 (Shawe-Taylor, J. et al. , eds), Curran Associates, pp. 783–791.Google Scholar

Nash, S. G. (2000), ‘A multigrid approach to discretized optimization problems’, Optim. Methods Software 14, 99–116.Google Scholar

Nazareth, L. and Tseng, P. (2002), ‘Gilding the lily: A variant of the Nelder–Mead algorithm based on golden-section search’, Comput. Optim. Appl. 22, 133–144.Google Scholar

Nelder, J. A. and Mead, R. (1965), ‘A simplex method for function minimization’, Comput. J. 7, 308–313.Google Scholar

Nesterov, Y. (2004), Introductory Lectures on Convex Optimization: A Basic Course, Vol. 87 of Applied Optimization, Springer.Google Scholar

Nesterov, Y. and Spokoiny, V. (2017), ‘Random gradient-free minimization of convex functions’, Found. Comput. Math. 17, 527–566.Google Scholar

Neumaier, A. (2004), Complete search in continuous global optimization and constraint satisfaction. In Acta Numerica, Vol. 13, Cambridge University Press, pp. 271–369.Google Scholar

Neumaier, A., Fendl, H., Schilly, H. and Leitner, T. (2011), ‘VXQR: Derivative-free unconstrained optimization based on QR factorizations’, Soft Comput. 15, 2287–2298.Google Scholar

Newby, E. and Ali, M. M. (2015), ‘A trust-region-based derivative free algorithm for mixed integer programming’, Comput. Optim. Appl. 60, 199–229.Google Scholar

Oeuvray, R. (2005), Trust-region methods based on radial basis functions with application to biomedical imaging. PhD thesis, EPFL.Google Scholar

Oeuvray, R. and Bierlaire, M. (2009), ‘BOOSTERS: A derivative-free algorithm based on radial basis functions’, Internat. J. Model. Simul. 29, 26–36.Google Scholar

Olsson, P.-M. (2014), Methods for network optimization and parallel derivative-free optimization. PhD thesis, Linköping University.Google Scholar

Omojokun, E. O. (1989), Trust region algorithms for optimization with nonlinear equality and inequality constraints. PhD thesis, University of Colorado at Boulder.Google Scholar

Paquette, C. and Scheinberg, K. (2018), A stochastic line search method with convergence rate analysis. arXiv:1807.07994 Google Scholar

Pasupathy, R. (2010), ‘On choosing parameters in retrospective-approximation algorithms for stochastic root finding and simulation optimization’, Oper. Res. 58, 889–901.Google Scholar

Pasupathy, R., Glynn, P., Ghosh, S. and Hashemi, F. S. (2018), ‘On sampling rates in simulation-based recursions’, SIAM J. Optim. 28, 45–73.Google Scholar

Patel, N. R., Smith, R. L. and Zabinsky, Z. B. (1989), ‘Pure adaptive search in Monte Carlo optimization’, Math. Program. 43, 317–328.Google Scholar

Peckham, G. (1970), ‘A new method for minimising a sum of squares without calculating gradients’, Comput. J. 13, 418–420.Google Scholar

Picheny, V., Gramacy, R. B., Wild, S. M. and Le Digabel, S. (2016), Bayesian optimization under mixed constraints with a slack-variable augmented Lagrangian. In Advances in Neural Information Processing Systems 29 (Lee, D. D. et al. , eds), Curran Associates, pp. 1435–1443.Google Scholar

Plantenga, T. D. (2009), HOPSPACK 2.0 user manual. Technical report SAND2009-6265, Sandia National Laboratories.Google Scholar

Polak, E. (1971), Computational Methods in Optimization: A Unified Approach, Academic Press.Google Scholar

Polak, E. and Wetter, M. (2006), ‘Precision control for generalized pattern search algorithms with adaptive precision function evaluations’, SIAM J. Optim. 16, 650–669.Google Scholar

Polyak, B. T. (1987), Introduction to Optimization, Optimization Software.Google Scholar

Porcelli, M. and Toint, P. L. (2017), ‘BFO: A trainable derivative-free brute force optimizer for nonlinear bound-constrained optimization and equilibrium computations with continuous and discrete variables’, ACM Trans. Math. Software 44, 1–25.Google Scholar

Pourmohamad, T. (2016), Combining multivariate stochastic process models with filter methods for constrained optimization. PhD thesis, UC Santa Cruz.Google Scholar

Powell, M. J. D. (1964), ‘An efficient method for finding the minimum of a function of several variables without calculating derivatives’, Comput. J. 7, 155–162.Google Scholar

Powell, M. J. D. (1965), ‘A method for minimizing a sum of squares of non-linear functions without calculating derivatives’, Comput. J. 7, 303–307.Google Scholar

Powell, M. J. D. (1975), ‘A view of unconstrained minimization algorithms that do not require derivatives’, ACM Trans. Math. Software 1, 97–107.Google Scholar

Powell, M. J. D. (1994), A direct search optimization method that models the objective and constraint functions by linear interpolation. In Advances in Optimization and Numerical Analysis (Gomez, S. and Hennart, J. P., eds), Vol. 275 of Mathematics and its Applications, Springer, pp. 51–67.Google Scholar

Powell, M. J. D. (1997), Trust region calculations revisited. In Numerical Analysis 1997 (Griffiths, D. F. et al. , eds), Vol. 380 of Pitman Research Notes in Mathematics, Addison Wesley Longman, pp. 193–211.Google Scholar

Powell, M. J. D. (1998a), Direct search algorithms for optimization calculations. In Acta Numerica, Vol. 7, Cambridge University Press, pp. 287–336.Google Scholar

Powell, M. J. D. (1998b), The use of band matrices for second derivative approximations in trust region algorithms. In Advances in Nonlinear Programming (Yuan, Y., ed.), Vol. 14 of Applied Optimization, Springer, pp. 3–28.Google Scholar

Powell, M. J. D. (2001), ‘On the Lagrange functions of quadratic models that are defined by interpolation’, Optim. Methods Software 16, 289–309.Google Scholar

Powell, M. J. D. (2002), ‘UOBYQA: Unconstrained optimization by quadratic approximation’, Math. Program. 92, 555–582.Google Scholar

Powell, M. J. D. (2003), ‘On trust region methods for unconstrained minimization without derivatives’, Math. Program. 97, 605–623.Google Scholar

Powell, M. J. D. (2004a), ‘Least Frobenius norm updating of quadratic models that satisfy interpolation conditions’, Math. Program. 100, 183–215.Google Scholar

Powell, M. J. D. (2004b), ‘On the use of quadratic models in unconstrained minimization without derivatives’, Optim. Methods Software 19, 399–411.Google Scholar

Powell, M. J. D. (2004c), On updating the inverse of a KKT matrix. Technical report DAMTP 2004/NA01, University of Cambridge.Google Scholar

Powell, M. J. D. (2006), The NEWUOA software for unconstrained optimization without derivatives. In Large-Scale Nonlinear Optimization (Di Pillo, G. and Roma, M., eds), Vol. 83 of Nonconvex Optimization and its Applications, Springer, pp. 255–297.Google Scholar

Powell, M. J. D. (2007), A view of algorithms for optimization without derivatives. Technical report DAMTP 2007/NA03, University of Cambridge.Google Scholar

Powell, M. J. D. (2008), ‘Developments of NEWUOA for minimization without derivatives’, IMA J. Numer. Anal. 28, 649–664.Google Scholar

Powell, M. J. D. (2009), The BOBYQA algorithm for bound constrained optimization without derivatives. Technical report DAMTP 2009/NA06, University of Cambridge.Google Scholar

Powell, M. J. D. (2010), ‘On the convergence of a wide range of trust region methods for unconstrained optimization’, IMA J. Numer. Anal. 30, 289–301.Google Scholar

Powell, M. J. D. (2012), ‘On the convergence of trust region algorithms for unconstrained minimization without derivatives’, Comput. Optim. Appl. 53, 527–555.Google Scholar

Powell, M. J. D. (2013), ‘Beyond symmetric Broyden for updating quadratic models in minimization without derivatives’, Math. Program. 138, 475–500.Google Scholar

Powell, M. J. D. (2015), ‘On fast trust region methods for quadratic models with linear constraints’, Math. Program. Comput. 7, 237–267.Google Scholar

Press, W. H., Teukolsky, S. A., Vetterling, W. T. and Flannery, B. P. (2007), Numerical Recipes in Fortran: The Art of Scientific Computing, third edition, Cambridge University Press.Google Scholar

Price, C. J. and Toint, P. L. (2006), ‘Exploiting problem structure in pattern search methods for unconstrained optimization’, Optim. Methods Software 21, 479–491.Google Scholar

Price, C. J., Coope, I. D. and Byatt, D. (2002), ‘A convergent variant of the Nelder–Mead algorithm’, J. Optim. Theory Appl. 113, 5–19.Google Scholar

Ralston, M. L. and Jennrich, R. I. (1978), ‘Dud: A derivative-free algorithm for nonlinear least squares’, Technometrics 20, 7–14.Google Scholar

Rashid, K., Ambani, S. and Cetinkaya, E. (2012), ‘An adaptive multiquadric radial basis function method for expensive black-box mixed-integer nonlinear constrained optimization’, Engng Optim. 45, 185–206.Google Scholar

Rastrigin, L. A. (1963), ‘The convergence of the random search method in the extremal control of many-parameter system’, Automat. Remote Control 24, 1337–1342.Google Scholar

Reddi, S. J., Hefny, A., Sra, S., Poczos, B. and Smola, A. (2016), Stochastic variance reduction for nonconvex optimization. In 33rd International Conference on Machine Learning (Balcan, M. F. and Weinberger, K. Q., eds). Proc. Mach. Learn. Res. 48, 314–323.Google Scholar

Regier, J., Jordan, M. I. and McAuliffe, J. (2017), Fast black-box variational inference through stochastic trust-region optimization. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al. , eds), Curran Associates, pp. 2402–2411.Google Scholar

Regis, R. G. (2013), ‘Constrained optimization by radial basis function interpolation for high-dimensional expensive black-box problems with infeasible initial points’, Engng Optim. 46, 218–243.Google Scholar

Regis, R. G. (2015), ‘The calculus of simplex gradients’, Optim. Lett. 9, 845–865.Google Scholar

Regis, R. G. (2016), ‘On the properties of positive spanning sets and positive bases’, Optim. Engng 17, 229–262.Google Scholar

Regis, R. G. and Shoemaker, C. A. (2007), ‘A stochastic radial basis function method for the global optimization of expensive functions’, INFORMS J. Comput. 19, 457–509.Google Scholar

Regis, R. G. and Wild, S. M. (2017), ‘CONORBIT: Constrained optimization by radial basis function interpolation in trust regions’, Optim. Methods Software 32, 552–580.Google Scholar

Rinnooy Kan, A. H. G. and Timmer, G. T. (1987a), ‘Stochastic global optimization methods, I: Clustering methods’, Math. Program. 39, 27–56.Google Scholar

Rinnooy Kan, A. H. G. and Timmer, G. T. (1987b), ‘Stochastic global optimization methods, II: Multi level methods’, Math. Program. 39, 57–78.Google Scholar

Rios, L. M. and Sahinidis, N. V. (2013), ‘Derivative-free optimization: A review of algorithms and comparison of software implementations’, J. Global Optim. 56, 1247–1293.Google Scholar

Robbins, H. (1952), ‘Some aspects of the sequential design of experiments’, Bull. Amer. Math. Soc. 58, 527–536.Google Scholar

Robbins, H. and Monro, S. (1951), ‘A stochastic approximation method’, Ann. Math. Statist. 22, 400–407.Google Scholar

Rockafellar, R. and Wets, R. J.-B. (2009), Variational Analysis, Springer.Google Scholar

Rosenbrock, H. H. (1960), ‘An automatic method for finding the greatest or least value of a function’, Comput. J. 3, 175–184.Google Scholar

Ruppert, D. (1991), Stochastic approximation. In Handbook of Sequential Analysis (Ghosh, B. K. and Sen, P. K., eds), Vol. 118 of Statistics: A Series of Textbooks and Monographs, CRC Press, pp. 503–529.Google Scholar

Russo, D. and Van Roy, B. (2016), ‘An information-theoretic analysis of Thompson sampling’, J. Mach. Learn. Res. 17, 2442–2471.Google Scholar

Rykov, A. S. (1980), ‘Simplex direct search algorithms’, Automat. Remote Control 41, 784–793.Google Scholar

Ryu, J. H. and Kim, S. (2014), ‘A derivative-free trust-region method for biobjective optimization’, SIAM J. Optim. 24, 334–362.Google Scholar

Sacks, J. (1958), ‘Asymptotic distribution of stochastic approximation procedures’, Ann. Math. Statist. 29, 373–405.Google Scholar

Saha, A. and Tewari, A. (2011), Improved regret guarantees for online smooth convex optimization with bandit feedback. In 14th International Conference on Artifical Intelligence and Statistics (AISTATS). Proc. Mach. Learn. Res. 15, 636–642.Google Scholar

Sampaio, P. R. and Toint, P. L. (2015), ‘A derivative-free trust-funnel method for equality-constrained nonlinear optimization’, Comput. Optim. Appl. 61, 25–49.Google Scholar

Sampaio, P. R. and Toint, P. L. (2016), ‘Numerical experience with a derivative-free trust-funnel method for nonlinear optimization problems with general nonlinear constraints’, Optim. Methods Software 31, 511–534.Google Scholar

Sankaran, S., Audet, C. and Marsden, A. L. (2010), ‘A method for stochastic constrained optimization using derivative-free surrogate pattern search and collocation’, J. Comput. Phys. 229, 4664–4682.Google Scholar

Scheinberg, K. and Toint, P. L. (2010), ‘Self-correcting geometry in model-based algorithms for derivative-free unconstrained optimization’, SIAM J. Optim. 20, 3512–3532.Google Scholar

Shahriari, B., Swersky, K., Wang, Z., Adams, R. P. and de Freitas, N. (2016), ‘Taking the human out of the loop: A review of Bayesian optimization’, Proc. IEEE 104, 148–175.Google Scholar

Shamir, O. (2013), On the complexity of bandit and derivative-free stochastic convex optimization. In 26th Annual Conference on Learning Theory (COLT 2013) (Shalev-Shwartz, S. and Steinwart, I., eds). Proc. Mach. Learn. Res. 30, 3–24.Google Scholar

Shamir, O. (2017), ‘An optimal algorithm for bandit and zero-order convex optimization with two-point feedback’, J. Mach. Learn. Res. 18(52), 1–11.Google Scholar

Shashaani, S., Hashemi, F. S. and Pasupathy, R. (2018), ‘ASTRO-DF: A class of adaptive sampling trust-region algorithms for derivative-free stochastic optimization’, SIAM J. Optim. 28, 3145–3176.Google Scholar

Shashaani, S., Hunter, S. R. and Pasupathy, R. (2016), ASTRO-DF: Adaptive sampling trust-region optimization algorithms, heuristics, and numerical experience. In 2016 Winter Simulation Conference (WSC), IEEE.Google Scholar

Shoemaker, C. A. and Regis, R. G. (2003), MAPO: using a committee of algorithm-experts for parallel optimization of costly functions. In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, ACM, pp. 242–243.Google Scholar

Spall, J. C. (1992), ‘Multivariate stochastic approximation using a simultaneous perturbation gradient approximation’, IEEE Trans. Automat. Control 37, 332–341.Google Scholar

Spall, J. C. (2005), Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, Wiley.Google Scholar

Spendley, W. (1969), Nonlinear least squares fitting using a modified simplex minimization method. In Optimization (Fletcher, R., ed.), Academic Press, pp. 259–270.Google Scholar

Spendley, W., Hext, G. R. and Himsworth, F. R. (1962), ‘Sequential application of simplex designs in optimisation and evolutionary operation’, Technometrics 4, 441–461.Google Scholar

Srebro, N., Sridharan, K. and Tewari, A. (2011), On the universality of online mirror descent. In Advances in Neural Information Processing Systems 24 (Shawe-Taylor, J. et al. , eds), Curran Associates, pp. 2645–2653.Google Scholar

Sriver, T. A., Chrissis, J. W. and Abramson, M. A. (2009), ‘Pattern search ranking and selection algorithms for mixed variable simulation-based optimization’, European J. Oper. Res. 198, 878–890.Google Scholar

Stich, S. U., Müller, C. L. and Gärtner, B. (2013), ‘Optimization of convex functions with random pursuit’, SIAM J. Optim. 23, 1284–1309.Google Scholar

Taddy, M., Lee, H. K. H., Gray, G. A. and Griffin, J. D. (2009), ‘Bayesian guided pattern search for robust local optimization’, Technometrics 51, 389–401.Google Scholar

Torczon, V. (1991), ‘On the convergence of the multidirectional search algorithm’, SIAM J. Optim. 1, 123–145.Google Scholar

Torczon, V. (1997), ‘On the convergence of pattern search algorithms’, SIAM J. Optim. 7, 1–25.Google Scholar

Törn, A. and Žilinskas, A. (1989), Global Optimization, Springer.Google Scholar

Tröltzsch, A. (2016), ‘A sequential quadratic programming algorithm for equality-constrained optimization without derivatives’, Optim. Lett. 10, 383–399.Google Scholar

Tseng, P. (1999), ‘Fortified-descent simplicial search method: A general approach’, SIAM J. Optim. 10, 269–288.Google Scholar

Valko, M., Carpentier, A. and Munos, R. (2013), Stochastic simultaneous optimistic optimization. In 30th International Conference on Machine Learning (ICML 13). Proc. Mach. Learn. Res. 28, 19–27.Google Scholar

Van Dyke, B. and Asaki, T. J. (2013), ‘Using QR decomposition to obtain a new instance of mesh adaptive direct search with uniformly distributed polling directions’, J. Optim. Theory Appl. 159, 805–821.Google Scholar

Vanden Berghen, F. (2004), CONDOR: A constrained, non-linear, derivative-free parallel optimizer for continuous, high computing load, noisy objective functions. PhD thesis, Université Libre de Bruxelles.Google Scholar

Vanden Berghen, F. and Bersini, H. (2005), ‘CONDOR: A new parallel, constrained extension of Powell’s UOBYQA algorithm: Experimental results and comparison with the DFO algorithm’, J. Comput. Appl. Math. 181, 157–175.Google Scholar

Verdério, A., Karas, E. W., Pedroso, L. G. and Scheinberg, K. (2017), ‘On the construction of quadratic models for derivative-free trust-region algorithms’, EURO J. Comput. Optim. 5, 501–527.Google Scholar

Vicente, L. N. (2013), ‘Worst case complexity of direct search’, EURO J. Comput. Optim. 1, 143–153.Google Scholar

Vicente, L. N. and Custódio, A. (2012), ‘Analysis of direct searches for discontinuous functions’, Math. Program. 133, 299–325.Google Scholar

Vu, K. K., D’Ambrosio, C., Hamadi, Y. and Liberti, L. (2016), ‘Surrogate-based methods for black-box optimization’, Internat. Trans. Oper. Res. 24, 393–424.Google Scholar

Wang, Y., Du, S. S., Balakrishnan, S. and Singh, A. (2018), Stochastic zeroth-order optimization in high dimensions. In 21st International Conference on Artificial Intelligence and Statistics (Storkey, A. and Perez-Cruz, F., eds). Proc. Mach. Learn. Res. 84, 1356–1365.Google Scholar

Wendland, H. (2005), Scattered Data Approximation, Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press.Google Scholar

Wild, S. M. (2008a), Derivative-free optimization algorithms for computationally expensive functions. PhD thesis, Cornell University.Google Scholar

Wild, S. M. (2008b), MNH: A derivative-free optimization algorithm using minimal norm Hessians. In Tenth Copper Mountain Conference on Iterative Methods. http://grandmaster.colorado.edu/∼copper/2008/SCWinners/Wild.pdf Google Scholar

Wild, S. M. (2017), Solving derivative-free nonlinear least squares problems with POUNDERS. In Advances and Trends in Optimization with Engineering Applications (Terlaky, T. et al. , eds), SIAM, pp. 529–540.Google Scholar

Wild, S. M. and Shoemaker, C. A. (2011), ‘Global convergence of radial basis function trust region derivative-free algorithms’, SIAM J. Optim. 21, 761–781.Google Scholar

Wild, S. M. and Shoemaker, C. A. (2013), ‘Global convergence of radial basis function trust-region algorithms for derivative-free optimization’, SIAM Review 55, 349–371.Google Scholar

Wild, S. M., Regis, R. G. and Shoemaker, C. A. (2008), ‘ORBIT: Optimization by radial basis function interpolation in trust-regions’, SIAM J. Sci. Comput. 30, 3197–3219.Google Scholar

Winfield, D. H. (1969) Function and functional optimization by interpolation in data tables. PhD thesis, Harvard University.Google Scholar

Winfield, D. H. (1973), ‘Function minimization by interpolation in a data table’, J. Inst. Math. Appl. 12, 339–347.Google Scholar

Woods, D. J. (1985), An interactive approach for solving multi-objective optimization problems. PhD thesis, Rice University.Google Scholar

Wright, M. H. (1995), Direct search methods: Once scorned, now respectable. In Numerical Analysis 1995 (Griffiths, D. F. and Watson, G. A., eds), Vol. 344 of Pitman Research Notes in Mathematics, Addison Wesley Longman, pp. 191–208.Google Scholar

Xiong, S., Qian, P. Z. G. and Wu, C. F. J. (2013), ‘Sequential design and analysis of high-accuracy and low-accuracy computer codes’, Technometrics 55, 37–46.Google Scholar

Xu, J. and Zikatanov, L. (2017), Algebraic multigrid methods. In Acta Numerica, Vol. 26, Cambridge University Press, pp. 591–721.Google Scholar

Yu, W.-C. (1979), ‘Positive basis and a class of direct search techniques’, Scientia Sinica, Special Issue of Mathematics 1, 53–67.Google Scholar

Yuan, Y.-X. (1985), ‘Conditions for convergence of trust region algorithms for nonsmooth optimization’, Math. Program. 31, 220–228.Google Scholar

Zabinsky, Z. B. and Smith, R. L. (1992), ‘Pure adaptive search in global optimization’, Math. Program. 53, 323–338.Google Scholar

Zhang, D. and Lin, G.-H. (2014), ‘Bilevel direct search method for leader–follower problems and application in health insurance’, Comput. Oper. Res. 41, 359–373.Google Scholar

Zhang, H. and Conn, A. R. (2012), ‘On the local convergence of a derivative-free algorithm for least-squares minimization’, Comput. Optim. Appl. 51, 481–507.Google Scholar

Zhang, H., Conn, A. R. and Scheinberg, K. (2010), ‘A derivative-free algorithm for least-squares minimization’, SIAM J. Optim. 20, 3555–3576.Google Scholar

Zhang, L., Yang, T., Jin, R. and Zhou, Z.-H. (2015), Online bandit learning for a special class of non-convex losses. In 29th AAAI Conference on Artificial Intelligence (AAAI ’15), AAAI Press, pp. 3158–3164.Google Scholar

Zhang, Z. (2014), ‘Sobolev seminorm of quadratic functions with applications to derivative-free optimization’, Math. Program. 146, 77–96.Google Scholar

Zhigljavsky, A. A. (1991), Theory of Global Random Search, Springer.Google Scholar

Article contents

Derivative-free optimization methods

Abstract

Access options

References

REFERENCES 12

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests