A Multiple Attribute Decision-Making Approach to Reinforcement Learning, IEEE Transactions on Cognitive and Developmental Systems



A Multiple Attribute Decision-Making Approach to Reinforcement Learning
IEEE Transactions on Cognitive and Developmental Systems (IF 5.0), Pub Date: 2020-12-01, DOI: 10.1109/tcds.2019.2924724
Haobin Shi, Meng Xu

In reinforcement learning (RL) systems, one important issue is the tradeoff between exploration and exploitation. In this paper, we study this dilemma and propose a new approach to it based on multiple-attribute decision making (MADM). The applicability of the proposed method is extended by transfer learning. The method decomposes a task into several subtasks and uses the policies of the subtasks trained by RL. The proposed visual MADM method (V-MADM) uses the state-action values of each subtask to select the action with the maximal value. In addition, this paper proposes a transfer learning method that uses a decay function with decreasing probability, so that the prior experience of the subtasks can be used to accelerate learning. Finally, experiments on robot confrontation and Maze walker are performed to evaluate the learning performance of the proposed method. The experimental results show that a lower training cost is needed to obtain more effective learning performance.
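The abstract gives no formulas, so the following is only a minimal sketch of the general idea (reuse subtask state-action values with a probability that decays over training), not the authors' V-MADM algorithm. The exponential decay schedule, the plain sum used to aggregate subtask values, and every name in the block (`transfer_probability`, `subtask_action`, `choose_action`, `p0`, `decay`) are assumptions introduced for illustration.

```python
import random


def transfer_probability(episode, p0=1.0, decay=0.99):
    """Probability of reusing subtask knowledge at a given episode.

    Assumption: exponential decay. The abstract only says the probability
    decreases; the actual decay function is not specified there.
    """
    return p0 * decay ** episode


def subtask_action(subtask_q, state):
    """Return the action with the largest aggregated subtask value.

    Assumption: values are aggregated by a plain sum, which stands in for
    the paper's MADM ranking step. subtask_q[k][state][a] is the
    state-action value of action a under subtask k.
    """
    n_actions = len(subtask_q[0][state])
    totals = [sum(q[state][a] for q in subtask_q) for a in range(n_actions)]
    return max(range(n_actions), key=totals.__getitem__)


def choose_action(task_q, subtask_q, state, episode, rng=random):
    """Follow the subtasks with decaying probability, else act greedily
    on the target task's own state-action values."""
    if rng.random() < transfer_probability(episode):
        return subtask_action(subtask_q, state)
    row = task_q[state]
    return max(range(len(row)), key=row.__getitem__)
```

Early in training the agent mostly follows the pretrained subtasks. As `episode` grows, the probability shrinks and the agent relies on the values it has learned for the full task.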


Updated: 2020-12-01
