Experience Sharing Based Memetic Transfer Learning for Multiagent Reinforcement Learning (Memetic Computing)



Memetic Computing (IF 3.3), Pub Date: 2021-07-02, DOI: 10.1007/s12293-021-00339-4
Tonghao Wang₁, Xingguang Peng₁, Demin Xu₁, Yaochu Jin₂


In transfer learning (TL) for multiagent reinforcement learning (MARL), the most popular methods are based on an action advising scheme, in which skilled agents directly transfer actions, i.e., explicit knowledge, to other agents. However, this scheme requires an inquiry-answer process, which makes the computational load grow quadratically with the number of agents. To enhance the scalability of TL for MARL when all agents learn from scratch, we propose an experience sharing based memetic TL method for MARL, called MeTL-ES. In MeTL-ES, the agents actively share implicit memetic knowledge (experience), which avoids the inquiry-answer process and yields highly scalable and effective acceleration of learning. Specifically, we first design an experience sharing scheme to share implicit meme-based experience among the agents. Within this scheme, experience from peers is collected and used to speed up the learning process. More importantly, this scheme frees the agents from actively asking for the states and policies of other agents, which enhances scalability. Second, an event-triggered scheme is designed to enable the agents to share experiences at appropriate times. Simulation studies show that, compared with existing methods, the proposed MeTL-ES more effectively enhances the learning speed of learning-from-scratch MARL systems. We also show that the communication cost and computational load of MeTL-ES increase linearly with the number of agents, indicating better scalability than the popular action advising based methods.
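The abstract does not specify the trigger condition or how experiences travel between agents, so the following is only a minimal Python sketch of the general idea under stated assumptions, not the authors' MeTL-ES implementation. It assumes a hypothetical shared pool that agents post to, a hypothetical trigger that fires when a reward beats the agent's running mean, and simple message-count models for the two schemes. All names (`Experience`, `Agent`, `SharedPool`, `run_round`, `advising_messages`, `sharing_messages`) are invented for illustration.

```python
import random
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Experience:
    """One transition; the field layout is an assumption, not from the paper."""
    state: int
    action: int
    reward: float
    next_state: int


@dataclass
class Agent:
    name: str
    buffer: deque = field(default_factory=lambda: deque(maxlen=1000))
    mean_reward: float = 0.0
    steps: int = 0

    def observe(self, exp):
        """Store a transition and report whether the sharing event fired."""
        # Hypothetical trigger: after the first step, share only when the
        # reward exceeds the agent's running mean reward so far.
        triggered = self.steps > 0 and exp.reward > self.mean_reward
        self.buffer.append(exp)
        self.steps += 1
        self.mean_reward += (exp.reward - self.mean_reward) / self.steps
        return triggered


class SharedPool:
    """Hypothetical shared store: agents post to it instead of querying peers."""

    def __init__(self):
        self.items = []

    def post(self, sender, exp):
        self.items.append((sender, exp))

    def pull(self, agent):
        """Copy peers' experiences (never the agent's own) into its buffer."""
        received = [exp for sender, exp in self.items if sender != agent.name]
        agent.buffer.extend(received)
        return len(received)


def run_round(agents, rng):
    """One step per agent with random dummy transitions; returns the post count."""
    pool = SharedPool()
    for agent in agents:
        exp = Experience(rng.randrange(5), rng.randrange(2),
                         rng.random(), rng.randrange(5))
        if agent.observe(exp):
            pool.post(agent.name, exp)
    for agent in agents:
        pool.pull(agent)
    return len(pool.items)


def advising_messages(n_agents):
    """Assumed model: every agent queries every peer; inquiry plus answer."""
    return 2 * n_agents * (n_agents - 1)


def sharing_messages(n_agents):
    """Assumed upper bound: at most one post and one pull per agent per round."""
    return 2 * n_agents
```

Under these assumed counting models, doubling the number of agents doubles `sharing_messages` but roughly quadruples `advising_messages`, which mirrors the linear versus quadratic contrast the abstract reports. The actual costs measured in the paper depend on details the abstract does not give.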


Updated: 2021-07-02
