
Analyzing randomness effects on the reliability of exploratory landscape analysis


Abstract

The inherent difficulty of solving a continuous, static, bound-constrained and single-objective black-box optimization problem depends on the characteristics of the problem’s fitness landscape and the algorithm being used. Exploratory landscape analysis (ELA) uses numerical features, generated by sampling the search space, to describe such characteristics. Despite their success in a number of applications, these features have limitations related to the computational cost of generating accurate results. Consequently, only approximations are available in practice, and these may be unreliable, leading to systematic errors. The overarching aim of this paper is to evaluate the reliability of five well-known ELA feature sets across multiple dimensions and sample sizes. For this purpose, we propose a comprehensive experimental methodology combining exploratory and statistical validation stages, which uses resampling techniques to minimize the sampling cost and statistical significance tests to identify the strengths and weaknesses of individual features. The data resulting from the methodology is collected and made available in the LEarning and OPtimization Archive of Research Data v1.0. The results show that instances of the same function can have significantly different feature values, which are therefore not generalizable across instances, due to the effects produced by the boundary constraints. In addition, some landscape features under evaluation are highly volatile and strongly susceptible to changes in sample size. Finally, the results show evidence of a curse of modality, meaning that the sample size should increase with the number of local optima.



Acknowledgements

We express our gratitude to the two reviewers and the guest editors for their thorough and valuable suggestions, which significantly improved this paper. We also acknowledge Saman K. Halgamuge for his feedback on earlier versions of this work.

Funding

Funding was provided by the Australian Research Council through the Australian Laureate Fellowship FL140100012, and The University of Melbourne through MIRS/MIFRS scholarships.

Author information


Corresponding author

Correspondence to Mario Andrés Muñoz.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Validation of the assumptions behind the experimental methodology

Our experimental methodology makes assumptions that can be summarized in two questions: (a) since Latin hypercube sampling (LHS) is a type of stratified sampling, is the independence assumption still valid? (b) What are the differences between taking multiple uniformly distributed random samples, bootstrapping a single uniformly distributed random sample, and bootstrapping a single LHS, when calculating the variance of an estimate? To answer these questions, we carried out two simple experiments that demonstrate that there is no practical difference in the results between taking multiple uniformly distributed random samples and bootstrapping an LHS. In the first experiment, we address the independence assumption by calculating the magnitude of the auto-correlation with lags in the \(\left[ 1,\ 50\right]\) range, for data drawn from the \(\left[ 0,\ 1\right]\) interval. For this assumption to hold for LHS, the magnitudes of the auto-correlation should follow the same trend as for a uniformly distributed random sample and be close to zero, indicating that it is not possible to estimate the value of one point from another. We repeat this experiment 1000 times and average the results, which are presented in Fig. 16a for samples with \(\left\{ 200,600,1000\right\}\) points. Other than the descending trend for a sample of 200 points, which can be explained by the decreasing number of points in the sample for which the auto-correlation can be calculated, the results demonstrate that the independence assumption holds for an LHS in practice.
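For concreteness, the following is a minimal Python sketch of this first experiment, not the code used in the paper. It assumes scipy's qmc.LatinHypercube as the LHS generator (which returns points in a randomly permuted stratum order, so serial auto-correlation is meaningful) and uses the parameter values stated above: lags 1 to 50, 1000 repetitions, and sample sizes drawn from {200, 600, 1000}.

```python
import numpy as np
from scipy.stats import qmc

def autocorr_magnitudes(x, max_lag=50):
    """Magnitude of the sample auto-correlation of x at lags 1..max_lag."""
    x = x - x.mean()
    denom = np.sum(x ** 2)
    return np.array([abs(np.sum(x[:-k] * x[k:])) / denom
                     for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
n_reps, n_points = 1000, 600  # n_points is one of {200, 600, 1000}

acf_uniform = np.zeros(50)
acf_lhs = np.zeros(50)
for _ in range(n_reps):
    u = rng.uniform(0.0, 1.0, n_points)                 # i.i.d. uniform sample
    l = qmc.LatinHypercube(d=1, seed=rng).random(n_points).ravel()  # LHS
    acf_uniform += autocorr_magnitudes(u)
    acf_lhs += autocorr_magnitudes(l)

acf_uniform /= n_reps
acf_lhs /= n_reps
# If independence holds for LHS, both averaged magnitudes should follow the
# same trend and stay close to zero across all lags (cf. Fig. 16a).
print(acf_uniform[:5], acf_lhs[:5])
```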

Fig. 16

Validation of the assumptions behind our experimental methodology. a Average magnitude of the auto-correlation with lags in the \(\left[ 1,\ 50\right]\) range, for data drawn from the \(\left[ 0,\ 1\right]\) interval. b Distribution of the variance of the mean from N samples of n points, \(IID\left( n,N\right)\), bootstrapping N times a sample of n points, \(IID+B\left( n,N\right)\), and bootstrapping N times a LHS of n points, \(LHS+B\left( n,N\right)\). The results confirm that there is no practical difference between taking N uniformly distributed random samples and bootstrapping N times a single LHS

In the second experiment, we address the second question by estimating the variance of the mean under three different sampling regimes, using data drawn from the \(\left[ 0,\ 1\right]\) interval. In the first regime, called \(IID\left( n,N\right)\), we took N uniformly distributed random samples of n points. In the second, called \(IID+B\left( n,N\right)\), we took one uniformly distributed random sample of n points and bootstrapped it N times. In the third, called \(LHS+B\left( n,N\right)\), we took one LHS of n points and bootstrapped it N times. Each sampling regime produced N mean estimates, from which the variance was calculated. The experiments were repeated 1000 times for all combinations of \(\left\{ n,N\right\} =\left\{ 200,600,1000\right\}\). The results are shown in Fig. 16b as box plots, which demonstrate that there is no practical difference between taking N uniformly distributed random samples and bootstrapping a single LHS N times.
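Again as an illustrative sketch rather than the original implementation, the three regimes could be compared as below for a single \(\left\{ n,N\right\}\) combination; the helper names (var_of_mean_iid, var_of_mean_bootstrap) are hypothetical, and the LHS generator is assumed to be scipy's qmc.LatinHypercube as before.

```python
import numpy as np
from scipy.stats import qmc

def var_of_mean_iid(rng, n, N):
    """Variance of the mean over N independent uniform samples of n points."""
    return np.var([rng.uniform(size=n).mean() for _ in range(N)])

def var_of_mean_bootstrap(rng, base, N):
    """Variance of the mean over N bootstrap resamples of a fixed base sample."""
    n = base.size
    return np.var([rng.choice(base, size=n, replace=True).mean()
                   for _ in range(N)])

rng = np.random.default_rng(1)
n, N = 600, 600  # one combination from {200, 600, 1000} x {200, 600, 1000}

v_iid = var_of_mean_iid(rng, n, N)                            # IID(n, N)
v_iid_b = var_of_mean_bootstrap(rng, rng.uniform(size=n), N)  # IID+B(n, N)
lhs = qmc.LatinHypercube(d=1, seed=rng).random(n).ravel()
v_lhs_b = var_of_mean_bootstrap(rng, lhs, N)                  # LHS+B(n, N)

# Repeating this 1000 times and comparing the distributions of the three
# variances reproduces the box-plot comparison of Fig. 16b; all three should
# be of comparable magnitude.
print(v_iid, v_iid_b, v_lhs_b)
```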


Cite this article

Muñoz, M.A., Kirley, M. & Smith-Miles, K. Analyzing randomness effects on the reliability of exploratory landscape analysis. Nat Comput 21, 131–154 (2022). https://doi.org/10.1007/s11047-021-09847-1
