Abstract
When developing optimisation algorithms, the focus often lies on obtaining an algorithm that outperforms existing algorithms on some performance measure. It is less common to ask why the observed performance differences arise. Such questions concern the impact of the various heuristic parameters and often remain unanswered. In this paper, the focus is on gaining insight into the behaviour of a heuristic algorithm by investigating how the various elements operating within the algorithm correlate with performance, obtaining indications of which combinations work well and which do not, and how all these effects are influenced by the specific problem instance the algorithm is solving. We consider two approaches for analysing algorithm parameters and components, namely functional analysis of variance and multilevel regression analysis, and study the benefits of using both approaches jointly. We present the results of a combined methodology that provides more insight than either approach used separately. The illustrative case studies in this paper analyse a large neighbourhood search algorithm applied to the vehicle routing problem with time windows and an iterated local search algorithm for the unrelated parallel machine scheduling problem with sequence-dependent setup times.
Notes
We interpret a parameter setting as a set of values and included operators.
A newer implementation was recently introduced at https://github.com/automl/fanova. The two versions give similar analysis results. We use the older version because it runs much faster in our experience, probably owing to the different programming languages in which the two versions are implemented.
Increasing the sample size increases the precision of the estimates, meaning their confidence intervals become narrower. Effects that are already significant will only become more significant. Whether an increased sample size contributes much to the analysis is difficult to judge. As sample size increases, even the smallest effects become significant, but that does not make them important (Sullivan and Feinn 2012). In our case, a larger sample size did not alter the conclusions of the analysis other than adding more precision. It did require substantially more time to fit the regression models, so we judged the current sample size of 4000 scenarios sufficient to represent the major variations in performance while keeping the time to fit the regression model practical.
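The relation between sample size, precision and significance can be illustrated with a minimal sketch. It assumes the simplest possible setting, a sample mean with known standard deviation, rather than the multilevel regression models used in the paper; the values of `std_dev` and `effect` are hypothetical and chosen only for illustration.

```python
import math


def ci_half_width(std_dev: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% confidence-interval half-width for a sample mean."""
    return z * std_dev / math.sqrt(n)


def z_statistic(effect: float, std_dev: float, n: int) -> float:
    """Test statistic for a fixed effect size; it grows with sqrt(n)."""
    return effect / (std_dev / math.sqrt(n))


std_dev = 1.0  # hypothetical standard deviation of the observations
effect = 0.02  # hypothetical effect, negligible in practical terms

for n in (1000, 4000, 16000, 64000):
    print(n, round(ci_half_width(std_dev, n), 4), round(z_statistic(effect, std_dev, n), 2))
```

In this simplified setting, quadrupling the sample size halves the interval width, and the same negligible effect moves from below to above the 1.96 threshold between 4000 and 16000 observations, although its size has not changed.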
When the observations are all positive continuous values, the logarithmic transformation is typically applied (Gelman and Hill 2006). However, the residual plot of the log-transformed values still shows increasing error variance, whereas the residual plot of the inverse-transformed values does not.
Since the problem instance characteristic Customers is a centred variable, it takes both positive and negative values. This rules out the logarithmic and square root transformations, which are not defined for negative values. The cube root transformation is defined for negative values and is therefore chosen.
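The argument for the cube root can be checked with a short sketch. The customer counts below are hypothetical, and `signed_cbrt` and `is_defined` are illustrative helpers, not part of the authors' analysis code.

```python
import math


def signed_cbrt(x: float) -> float:
    """Real cube root that keeps the sign of x, defined for every real x."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def is_defined(transform, x: float) -> bool:
    """Return True if transform(x) yields a real number, False on a domain error."""
    try:
        transform(x)
        return True
    except ValueError:
        return False


# Hypothetical customer counts, centred around their mean.
customers = [25, 50, 100, 200]
mean = sum(customers) / len(customers)
centred = [c - mean for c in customers]

for x in centred:
    # Columns: centred value, log defined?, sqrt defined?, cube root
    print(x, is_defined(math.log, x), is_defined(math.sqrt, x), round(signed_cbrt(x), 3))
```

For the negative centred values, `math.log` and `math.sqrt` raise a domain error, while the cube root returns a negative real number and so preserves the ordering of the observations.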
References
Ansótegui, C., Sellmann, M., Tierney, K.: A gender-based genetic algorithm for the automatic configuration of algorithms. In: International Conference on Principles and Practice of Constraint Programming, pp. 142–157. Springer (2009)
Bartz-Beielstein, T., Parsopoulos, K.E., Vrahatis, M.N.: Design and analysis of optimization algorithms using computational statistics. Appl. Numer. Anal. Comput. Math. 1(2), 413–433 (2004)
Bertsimas, D.J., Simchi-Levi, D.: A new generation of vehicle routing research: robust algorithms, addressing uncertainty. Oper. Res. 44(2), 286 (1996)
Birattari, M.: Tuning Metaheuristics, Studies in Computational Intelligence, vol. 197. Springer, Berlin (2009)
Burke, E.K., Bykov, Y.: A late acceptance strategy in hill-climbing for exam timetabling problems. In: PATAT 2008 Conference, Montreal, Canada (2008)
Bykov, Y., Petrovic, S.: An initial study of a novel step counting hill climbing heuristic applied to timetabling problems. In: Proceedings of 6th Multidisciplinary International Scheduling Conference (MISTA 2013) (2013)
Chiarandini, M., Goegebeur, Y.: Mixed models for the analysis of optimization algorithms. Exp. Methods Anal. Optim. Algorithms 1, 225 (2010)
Corstjens, J., Caris, A., Depaire, B.: Explaining heuristic performance differences for vehicle routing problems with time windows. In: Kotsireas, S., Pardalos, P.M. (eds.) Learning and Intelligent Optimization. Lecture Notes in Computer Science. Springer, Berlin (2018). (in press)
Corstjens, J., Depaire, B., Caris, A., Sörensen, K.: A multilevel evaluation method for heuristics with an application to the VRPTW. Manuscript submitted for publication (2017)
Coy, S.P., Golden, B.L., Runger, G.C., Wasil, E.A.: Using experimental design to find effective parameter settings for heuristics. J. Heuristics 7(1), 77–97 (2001)
De Leeuw, J., Meijer, E., Goldstein, H.: Handbook of Multilevel Analysis. Springer, Berlin (2008)
Dietterich, T.: Overfitting and undercomputing in machine learning. ACM Comput. Surv. CSUR 27(3), 326–327 (1995)
Fawcett, C., Hoos, H.H.: Analysing differences between algorithm configurations through ablation. J. Heuristics 22, 1–28 (2015)
Gelman, A., Hill, J.: Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge (2006)
Hair, J.F., Anderson, R.E., Babin, B.J., Black, W.C.: Multivariate Data Analysis: A Global Perspective, vol. 7. Pearson, Upper Saddle River, NJ (2010)
Hooker, G.: Generalized functional anova diagnostics for high-dimensional functions of dependent variables. J. Comput. Graph. Stat. 16, 709–732 (2012)
Hooker, J.N.: Testing heuristics: we have it all wrong. J. Heuristics 1(1), 33–42 (1995)
Hutter, F., Hoos, H.H., Leyton-Brown, K.: Sequential model-based optimization for general algorithm configuration. In: Coello, C.A.C. (ed.) Learning and Intelligent Optimization, Lecture Notes in Computer Science, vol. 6683, pp. 507–523. Springer, Berlin (2011)
Hutter, F., Hoos, H.H., Leyton-Brown, K.: Identifying key algorithm parameters and instance features using forward selection. In: International Conference on Learning and Intelligent Optimization, pp. 364–381. Springer (2013)
Hutter, F., Hoos, H.H., Leyton-Brown, K.: An efficient approach for assessing hyperparameter importance. In: International Conference on Machine Learning, pp. 754–762 (2014)
Hutter, F., Hoos, H.H., Leyton-Brown, K., Stützle, T.: ParamILS: an automatic algorithm configuration framework. J. Artif. Intell. Res. 36(1), 267–306 (2009)
Jones, Z., Linder, F.: Exploratory data analysis using random forests. In: Prepared for the 73rd Annual MPSA Conference (2015)
Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
Lawson, J., Erjavec, J.: Basic Experimental Strategies and Data Analysis for Science and Engineering. CRC Press, Boca Raton (2016)
Leyton-Brown, K., Nudelman, E., Shoham, Y.: Empirical hardness models: methodology and a case study on combinatorial auctions. J. ACM 56(4), 22 (2009)
López-Ibáñez, M., Dubois-Lacoste, J., Cáceres, L.P., Birattari, M., Stützle, T.: The IRACE package: iterated racing for automatic algorithm configuration. Oper. Res. Perspect. 3, 43–58 (2016)
Lourenço, H.R., Martin, O.C., Stützle, T.: Iterated local search. In: Glover, F.W., Kochenberger, G.A. (eds.) Handbook of Metaheuristics, pp. 320–353. Springer, Berlin (2003)
Montgomery, D.: Design and Analysis of Experiments, 8th edn. Wiley, New York (2012)
Moore, D.S., McCabe, G.P., Craig, B.A.: Introduction to the Practice of Statistics, 6th edn. W. H. Freeman, New York (2007)
Nannen, V., Eiben, A.E.: Relevance estimation and value calibration of evolutionary algorithm parameters. In: International Joint Conference on Artificial Intelligence, vol. 7, pp. 975–980 (2007)
PassMark Software: CPU benchmarks. https://www.cpubenchmark.net/ (2018). Accessed 26 Mar 2018
Pellegrini, P., Birattari, M.: The relevance of tuning the parameters of metaheuristics. A case study: the vehicle routing problem with stochastic demand. Technical report TR/IRIDIA/2006-008, IRIDIA, Université Libre de Bruxelles, Brussels, Belgium (2006)
Pisinger, D., Ropke, S.: A general heuristic for vehicle routing problems. Comput. Oper. Res. 34(8), 2403–2435 (2007)
Rardin, R.L., Uzsoy, R.: Experimental evaluation of heuristic optimization algorithms: a tutorial. J. Heuristics 7(3), 261–304 (2001)
Rasku, J., Musliu, N., Kärkkäinen, T.: Automating the parameter selection in VRP: an off-line parameter tuning tool comparison. In: Fitzgibbon, W., Kuznetsov, Y.A., Neittaanmäki, P., Pironneau, O. (eds.) Modeling, Simulation and Optimization for Science and Technology, pp. 191–209. Springer, Berlin (2014)
Santos, H.G., Toffolo, T.A., Silva, C.L., Vanden Berghe, G.: Analysis of stochastic local search methods for the unrelated parallel machine scheduling problem. Int. Trans. Oper. Res. (2016). https://doi.org/10.1111/itor.12316
Simmons, J.P., Nelson, L.D., Simonsohn, U.: False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychol. Sci. 22(11), 1359–1366 (2011)
Smith-Miles, K., Bowly, S.: Generating new test instances by evolving in instance space. Comput. Oper. Res. 63, 102–113 (2015)
Solomon, M.M.: Algorithms for the vehicle routing and scheduling problems with time window constraints. Oper. Res. 35(2), 254–265 (1987)
Stock, J., Watson, M.W.: Introduction to Econometrics. Prentice Hall, New York (2011)
Sullivan, G.M., Feinn, R.: Using effect size—or why the p value is not enough. J. Grad. Med. Educ. 4(3), 279–282 (2012)
Vallada, E., Ruiz, R.: A genetic algorithm for the unrelated parallel machine scheduling problem with sequence dependent setup times. Eur. J. Oper. Res. 211(3), 612–622 (2011)
Acknowledgements
This work is funded by COMEX (Project P7/36), a BELSPO/IAP Programme. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation – Flanders (FWO) and the Flemish Government department EWI. The authors would like to thank Túlio Toffolo for providing the data for the second case study.
Cite this article
Corstjens, J., Dang, N., Depaire, B. et al. A combined approach for analysing heuristic algorithms. J Heuristics 25, 591–628 (2019). https://doi.org/10.1007/s10732-018-9388-7
DOI: https://doi.org/10.1007/s10732-018-9388-7