Automatic Tuning of Rule-Based Evolutionary Machine Learning via Problem Structure Identification, IEEE Computational Intelligence Magazine

IEEE Computational Intelligence Magazine (IF 10.3), Pub Date: 2020-08-01, DOI: 10.1109/mci.2020.2998232
Maria A. Franco, Natalio Krasnogor, Jaume Bacardit

The success of any machine learning technique depends on the correct setting of its parameters and, when it comes to large-scale datasets, hand-tuning these parameters becomes impractical. However, very large datasets can be pre-processed to distil information that could help in setting various system parameters appropriately. In turn, this makes sophisticated machine learning methods easier for end users to use. Thus, by modelling the performance of machine learning algorithms as a function of the structure inherent in very large datasets, one could, in principle, detect "hotspots" in the parameter space and thus auto-tune machine learning algorithms for better dataset-specific performance. In this work we present a parameter setting mechanism for a rule-based evolutionary machine learning system that is capable of finding an adequate parameter value for a wide variety of synthetic classification problems with binary attributes, with and without added noise. Moreover, in the final validation stage our automated mechanism is able to reduce the computational time of preliminary experiments by up to 71% for a challenging real-world bioinformatics dataset.

Updated: 2020-08-01
