Multi-agent Reinforcement Learning Improvement in a Dynamic Environment Using Knowledge Transfer
arXiv - CS - Multiagent Systems Pub Date : 2021-07-20 , DOI: arxiv-2107.09807
Mahnoosh Mahdavimoghaddam, Amin Nikanjam, Monireh Abdoos

Cooperative multi-agent systems are widely used across many domains. Interaction among agents brings benefits such as lower operating costs, high scalability, and support for parallel processing, and these systems are well suited to large-scale, unknown, and dynamic environments. However, learning in such environments remains a major challenge in many applications: the size of the search space inflates learning time, cooperation among agents can be inefficient, and agents' decisions may lack proper coordination. Moreover, reinforcement learning algorithms can suffer from long convergence times on these problems. This paper introduces a communication framework based on knowledge transfer to address these challenges in the herding problem with a large state space. Knowledge transfer mitigates the convergence problem and can significantly increase the efficiency of reinforcement learning algorithms. Coordination among the agents is carried out through a head agent within each group of agents and, above the groups, a coordinator agent. The results demonstrate that this framework increases the speed of learning and reduces convergence time.
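The abstract's central claim, that transferring previously learned knowledge speeds up convergence of reinforcement learning, can be illustrated with a minimal tabular Q-learning sketch. This is not the paper's actual framework: the chain-world environment, hyperparameters, and the transfer-by-copying-a-Q-table mechanism are all illustrative assumptions, chosen only to show why a warm-started learner converges faster than a cold-started one.

```python
import random

def q_learning(episodes, q=None, n_states=10, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a 1-D chain: start at state 0, reward at the far end."""
    if q is None:
        q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    steps_per_episode = []
    for _ in range(episodes):
        s, steps = 0, 0
        while s != n_states - 1 and steps < 200:
            # eps-greedy action selection; ties broken randomly so an
            # untrained agent performs an unbiased random walk
            if random.random() < eps or q[s][0] == q[s][1]:
                a = random.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard Q-learning update
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode

random.seed(0)
# A "source" agent learns the task from scratch.
source_q, _ = q_learning(episodes=200)
# Knowledge transfer: a new agent starts from a copy of the learned Q-table.
_, transferred = q_learning(episodes=20, q=[row[:] for row in source_q])
# Baseline: an identical agent starts cold.
_, scratch = q_learning(episodes=20)
print(sum(transferred), sum(scratch))
```

Run once, the warm-started agent needs far fewer total steps over its 20 episodes than the cold-started baseline, which spends its early episodes random-walking toward the goal. The same intuition, reusing value estimates instead of relearning them, underlies the convergence speed-up the paper reports.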

Updated: 2021-07-22