

Learning feature spaces for regression with genetic programming
Genetic Programming and Evolvable Machines (IF 1.7), Pub Date: 2020-03-11, DOI: 10.1007/s10710-020-09383-4
William La Cava, Jason H Moore

Genetic programming has found recent success as a tool for learning sets of features for regression and classification. Multidimensional genetic programming is a useful variant of genetic programming for this task because it represents candidate solutions as sets of programs. These sets of programs expose additional information that can be exploited for building block identification. In this work, we discuss this architecture and others in terms of their propensity for allowing heuristic search to utilize information during the evolutionary process. We investigate methods for biasing the components of programs that are promoted in order to guide search towards useful and complementary feature spaces. We study two main approaches: (1) the introduction of new objectives and (2) the use of specialized semantic variation operators. We find that a semantic crossover operator based on stagewise regression leads to significant improvements on a set of regression problems. The inclusion of semantic crossover produces state-of-the-art results in a large benchmark study of open-source regression problems in comparison to several state-of-the-art machine learning approaches and other genetic programming frameworks. Finally, we look at the collinearity and complexity of the data representations produced by different methods, in order to assess whether relevant, concise, and independent factors of variation can be produced in application.
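To make the idea of pooling features from sets of programs and selecting among them in a stagewise fashion more concrete, here is a minimal sketch in Python. It is not the authors' operator: it only illustrates plain forward stagewise selection, where the columns stand in for the outputs of the programs held by two parent individuals. The function name `stagewise_select`, the toy data, and the choice of two programs per parent are all assumptions made for illustration.

```python
import numpy as np


def stagewise_select(pool, y, k):
    """Greedily pick k columns of `pool` by correlation with the residual.

    Illustrative only: `pool` holds the outputs of candidate programs
    (one column per program). Assumes no column is constant.
    """
    # Standardize columns so each has zero mean and unit variance.
    F = (pool - pool.mean(axis=0)) / pool.std(axis=0)
    residual = y - y.mean()
    available = np.ones(F.shape[1], dtype=bool)
    chosen = []
    for _ in range(k):
        # Covariance of each standardized feature with the residual.
        cov = F.T @ residual / len(y)
        scores = np.where(available, np.abs(cov), -np.inf)
        j = int(np.argmax(scores))
        chosen.append(j)
        available[j] = False
        # For a unit-variance column, the least-squares coefficient
        # of the residual on that column equals the covariance.
        residual = residual - cov[j] * F[:, j]
    return chosen


# Toy data (hypothetical): the target depends on x1 and x2**2.
rng = np.random.default_rng(0)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 3.0 * x1 + 2.0 * x2**2

# Each "parent" is a set of program outputs; one useful, one noise.
parent_a = np.column_stack([x1, rng.normal(size=n)])
parent_b = np.column_stack([x2**2, rng.normal(size=n)])
pool = np.hstack([parent_a, parent_b])

# The "offspring" keeps the k pooled features chosen stagewise.
chosen = stagewise_select(pool, y, k=2)
offspring = pool[:, chosen]
```

In this toy setting the two informative columns (indices 0 and 2 of the pool) are the ones selected, so the offspring combines the useful program from each parent and drops the noise features. Because each pick is scored against the residual left by the earlier picks, a feature that merely duplicates one already chosen scores low, which is one way a selection step can favor complementary rather than collinear features.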


Updated: 2020-03-11
