Learning Adaptive Differential Evolution Algorithm From Optimization Experiences by Policy Gradient, IEEE Transactions on Evolutionary Computation

Learning Adaptive Differential Evolution Algorithm From Optimization Experiences by Policy Gradient
IEEE Transactions on Evolutionary Computation (IF 11.7), Pub Date: 2021-02-19, DOI: 10.1109/tevc.2021.3060811
Jianyong Sun, Xin Liu, Thomas Back, Zongben Xu

Differential evolution is one of the most prestigious population-based stochastic optimization algorithms for black-box problems. The performance of a differential evolution algorithm depends heavily on its mutation and crossover strategies and their associated control parameters. However, determining the most suitable parameter setting is troublesome and time-consuming. Adaptive parameter control methods that can adapt to the problem landscape and optimization environment are preferable to fixed parameter settings. This article proposes a novel adaptive parameter control approach based on learning from the optimization experiences over a set of problems. In the approach, the parameter control is modeled as a finite-horizon Markov decision process. A reinforcement learning algorithm, policy gradient, is applied to learn an agent (i.e., a parameter controller) that adaptively provides the control parameters of a proposed differential evolution algorithm during the search procedure. The differential evolution algorithm based on the learned agent is compared against nine well-known evolutionary algorithms on the CEC’13 and CEC’17 test suites. Experimental results show that the proposed algorithm performs competitively against the compared algorithms on these test suites.
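To make the idea in the abstract concrete, the following is a minimal sketch of parameter control framed as a finite-horizon decision process and trained with a REINFORCE-style policy gradient. It is not the authors' method: the abstract does not specify the state, the reward, or the agent architecture, so everything here is an assumption chosen for brevity. In particular, the policy is a state-independent Gaussian over the scale factor F and crossover rate CR, the reward is the log improvement of the best fitness per generation, the optimizer is plain DE/rand/1/bin, and the names `sphere`, `de_generation`, `run_episode`, and `reinforce` are hypothetical.

```python
import numpy as np

def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return float(np.sum(x ** 2))

def de_generation(pop, fit, f, F, CR, rng):
    """One DE/rand/1/bin generation with greedy one-to-one selection."""
    n, d = pop.shape
    new_pop, new_fit = pop.copy(), fit.copy()
    for i in range(n):
        # Pick three distinct donors, all different from the target i.
        others = [j for j in range(n) if j != i]
        a, b, c = rng.choice(others, size=3, replace=False)
        mutant = pop[a] + F * (pop[b] - pop[c])
        cross = rng.random(d) < CR
        cross[rng.integers(d)] = True  # at least one gene comes from the mutant
        trial = np.where(cross, mutant, pop[i])
        f_trial = f(trial)
        if f_trial <= fit[i]:
            new_pop[i], new_fit[i] = trial, f_trial
    return new_pop, new_fit

def run_episode(theta, sigma, rng, horizon=20, n=12, d=5):
    """One finite-horizon episode: the policy picks (F, CR) once per generation."""
    pop = rng.uniform(-5.0, 5.0, size=(n, d))
    fit = np.array([sphere(x) for x in pop])
    grads, rewards = [], []
    for _ in range(horizon):
        # Gaussian policy with mean theta and fixed std sigma (assumption).
        raw = theta + sigma * rng.standard_normal(2)
        F = float(np.clip(raw[0], 0.1, 1.0))
        CR = float(np.clip(raw[1], 0.0, 1.0))
        best_before = fit.min()
        pop, fit = de_generation(pop, fit, sphere, F, CR, rng)
        # Assumed reward: log improvement of the best fitness this generation.
        rewards.append(np.log(best_before + 1e-12) - np.log(fit.min() + 1e-12))
        # Gradient of log N(raw; theta, sigma^2 I) with respect to theta.
        grads.append((raw - theta) / sigma ** 2)
    return np.array(grads), np.array(rewards), float(fit.min())

def reinforce(iterations=30, episodes=8, lr=0.005, sigma=0.1, seed=0):
    """REINFORCE with a normalized-return baseline over batches of episodes."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.5, 0.5])  # initial mean for (F, CR)
    for _ in range(iterations):
        batch = [run_episode(theta, sigma, rng) for _ in range(episodes)]
        returns = np.array([r.sum() for _, r, _ in batch])
        adv = (returns - returns.mean()) / (returns.std() + 1e-8)
        grad = np.mean([g.sum(axis=0) * a for (g, _, _), a in zip(batch, adv)], axis=0)
        theta = np.clip(theta + lr * grad, [0.1, 0.0], [1.0, 1.0])
    return theta

if __name__ == "__main__":
    print("learned mean (F, CR):", reinforce())
```

A controller that conditions on the search state (for example, population statistics fed to a neural network) would replace the constant mean `theta` with a function of that state; the sketch keeps the mean constant only to show the episode, reward, and gradient-update structure in a few lines.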

Updated: 2021-02-19
