Benchmarking state-of-the-art symbolic regression algorithms
Genetic Programming and Evolvable Machines (IF 2.6), Pub Date: 2020-03-24, DOI: 10.1007/s10710-020-09387-0
Jan Žegklitz, Petr Pošík

Symbolic regression (SR) is a powerful method for building predictive models from data without assuming any model structure. Traditionally, genetic programming (GP) was used as the SR engine. However, these purely evolutionary methods struggled even to fit the function to the range of the data, so training was inefficient and slow. Recently, several SR algorithms have emerged that employ multiple linear regression, which allows them to create models with relatively small error right from the beginning of the search. Such algorithms are claimed to be orders of magnitude faster than SR algorithms based on classic GP. However, a systematic comparison of these algorithms on a common set of problems is still missing, and there is no basis on which to decide which algorithm to use. In this paper we conceptually and experimentally compare several representatives of such algorithms: GPTIPS, FFX, and EFS. We also include GSGP-Red, an enhanced version of geometric semantic genetic programming, an important algorithm in the field of SR. All are applied as off-the-shelf, ready-to-use techniques, mostly with their default settings. The methods are compared on several synthetic SR benchmark problems as well as real-world ones ranging from civil engineering to aerodynamics and acoustics. Their performance is also compared with that of three conventional machine learning algorithms: multiple regression, random forests, and support vector regression. The results suggest that, across all the problems, the algorithms have comparable performance. We provide basic recommendations to the user regarding the choice of algorithm.
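
The role of multiple linear regression described in the abstract can be illustrated with a minimal sketch. It is not the implementation of GPTIPS, FFX, EFS, or GSGP-Red. It only assumes that a search procedure has already proposed some candidate nonlinear features (here hard-coded, hypothetical ones) and shows how an ordinary least-squares fit scales and shifts them to the range of the data. The target function, the feature set, and the helper name `fit_linear_combination` are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_combination(features, y):
    """Least-squares fit of y ~ b0 + sum_i b_i * features[:, i]."""
    # Prepend a column of ones so the intercept b0 is fitted as well.
    design = np.column_stack([np.ones(len(y)), features])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs, design @ coeffs


rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=200)

# Illustrative target with a large offset and scale (an assumption, not a
# benchmark from the paper).
y = 1000.0 + 250.0 * np.sin(x) - 40.0 * x**2

# Hypothetical candidate features, standing in for what a search might propose.
features = np.column_stack([np.sin(x), x**2])

# Without fitted coefficients: the plain sum of the features.
raw_prediction = features.sum(axis=1)

# With multiple linear regression on top of the same features.
coeffs, fitted = fit_linear_combination(features, y)

rmse_raw = np.sqrt(np.mean((y - raw_prediction) ** 2))
rmse_fit = np.sqrt(np.mean((y - fitted) ** 2))
print(f"RMSE without linear fit: {rmse_raw:.3f}")
print(f"RMSE with linear fit:    {rmse_fit:.3e}")
```

In this toy setting the features already match the structure of the target, so the linear fit removes almost all of the error. In a real SR run the features are only approximate, but the same step spares the search from having to discover suitable offsets and scaling constants on its own.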

Updated: 2020-03-24