Dynamic heuristic acceleration of linearly approximated SARSA(\(\lambda\)): using ant colony optimization to learn heuristics dynamically
Journal of Heuristics ( IF 2.7 ) Pub Date : 2019-05-03 , DOI: 10.1007/s10732-019-09408-x
Stefano Bromuri

Heuristically accelerated reinforcement learning (HARL) is a family of algorithms that combines the advantages of reinforcement learning (RL) with those of heuristic algorithms. To achieve this, the action selection strategy of the standard RL algorithm is modified to take into account a heuristic running in parallel with the RL process. This paper presents two approximated HARL algorithms that improve the behaviour of linearly approximated SARSA(\(\lambda\)) by dynamically learning a heuristic function from pheromone trails. The proposed dynamic algorithms are evaluated against linearly approximated SARSA(\(\lambda\)) and against heuristically accelerated SARSA(\(\lambda\)) with a static heuristic, in three benchmark scenarios: mountain car, mountain car 3D, and maze.
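The two ingredients described above can be sketched in a few lines: HARL biases greedy action selection toward \(Q(s,a) + \xi H(s,a)\), and the paper's contribution is to learn \(H\) dynamically from ACO-style pheromone trails rather than fix it in advance. The following is a minimal illustrative sketch, not the paper's exact formulation; the names `xi`, `rho`, and `deposit` and the tabular (rather than linearly approximated) representation are simplifying assumptions.

```python
import random

def heuristic_action(q, pheromone, state, actions, xi=1.0, epsilon=0.1):
    """Epsilon-greedy selection biased by a pheromone-based heuristic:
    the greedy choice maximizes Q(s, a) + xi * H(s, a)."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions,
               key=lambda a: q.get((state, a), 0.0)
                             + xi * pheromone.get((state, a), 0.0))

def update_pheromone(pheromone, episode, ret, rho=0.1, deposit=1.0):
    """ACO-style trail update: evaporate all trails by factor (1 - rho),
    then reinforce the visited (state, action) pairs in proportion to
    the episode return, so the heuristic adapts as learning proceeds."""
    for key in pheromone:
        pheromone[key] *= (1.0 - rho)
    for state, action in episode:
        pheromone[(state, action)] = (pheromone.get((state, action), 0.0)
                                      + deposit * ret)
```

With `xi = 0` this degenerates to ordinary epsilon-greedy SARSA selection, and with a fixed `pheromone` table it corresponds to the static-heuristic baseline the paper compares against.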
