FAOD: Fast Automatic Option Discovery in Hierarchical Reinforcement Learning


International Journal on Artificial Intelligence Tools (IF 1.0), Pub Date: 2021-03-26, DOI: 10.1142/s0218213021500068
Zoulikha Koudad


The hierarchical reinforcement learning framework breaks the reinforcement learning problem down into subtasks, or extended actions called options, to make it easier to solve. Different models have been proposed in which options are manually predefined or semi-automatically discovered. Fully automatic option discovery, however, remains a real challenge for research in hierarchical reinforcement learning, and recently proposed approaches are very costly in learning time or space. We therefore opt for a faster and less resource-intensive approach. In this paper we propose an automatic option discovery method for hierarchical reinforcement learning, which we call FAOD (Fast Automatic Option Discovery). We take inspiration from robot learning methods to categorize the sensorimotor flow during navigation. Here, the agent moves along the walls to discover the rooms' contours, closed spaces, doors, and bottleneck regions, which define the termination states for options. The FAOD method is evaluated on several classical maze problems, demonstrating fast and promising results.
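
To make the idea of door and bottleneck cells serving as termination states concrete, here is a minimal sketch in Python. It is not the FAOD algorithm: the abstract describes an agent that follows walls, whereas this sketch simply scans a small grid for a local wall pattern (walls on two opposite sides, free cells on the other two). The maze layout and the names `MAZE`, `find_doorways`, and `make_termination` are assumptions made for illustration only.

```python
from typing import Callable, List, Set, Tuple

Cell = Tuple[int, int]  # (row, column)

# Hypothetical two-room maze: '#' is a wall, '.' is free space.
MAZE = [
    "#######",
    "#..#..#",
    "#.....#",
    "#..#..#",
    "#######",
]


def is_wall(maze: List[str], r: int, c: int) -> bool:
    return maze[r][c] == "#"


def find_doorways(maze: List[str]) -> Set[Cell]:
    """Return free cells squeezed between two opposite walls.

    Simplifying assumption: a doorway is a free cell with walls on two
    opposite sides and free cells on the other two. The outer border is
    assumed to be solid wall, so only interior cells are scanned.
    """
    doorways: Set[Cell] = set()
    for r in range(1, len(maze) - 1):
        for c in range(1, len(maze[r]) - 1):
            if is_wall(maze, r, c):
                continue
            up, down = is_wall(maze, r - 1, c), is_wall(maze, r + 1, c)
            left, right = is_wall(maze, r, c - 1), is_wall(maze, r, c + 1)
            horizontal_passage = up and down and not left and not right
            vertical_passage = left and right and not up and not down
            if horizontal_passage or vertical_passage:
                doorways.add((r, c))
    return doorways


def make_termination(doorways: Set[Cell]) -> Callable[[Cell], float]:
    """Build a termination function: 1.0 at a doorway, 0.0 elsewhere."""
    return lambda state: 1.0 if state in doorways else 0.0


doorways = find_doorways(MAZE)
beta = make_termination(doorways)
print(sorted(doorways))
```

In this toy layout the only cell flagged is the single gap in the wall between the two rooms. Note that under this local rule every cell of a one-cell-wide corridor would also be flagged, which is one reason a contour-based approach like the one the abstract describes may be preferable.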


Updated: 2021-03-26
