Multi-agent machine learning in self-organizing systems, Information Sciences



Information Sciences (IF 8.1), Pub Date: 2021-09-08, DOI: 10.1016/j.ins.2021.09.013
Ehsan Hejazi

This paper presents a novel perspective and a procedure, comprising several algorithms, for finding the best solution in a structured multi-agent system with internal communications and a global purpose. In other words, it finds the optimal communication structure among the agents and the optimal policy within that structure. First, a unique reinforcement learning algorithm is proposed that finds the optimal policy of each agent in a fixed structure, using non-linear function approximators such as artificial neural networks (ANNs) together with eligibility traces. Second, a mechanism is presented that performs self-organization based on information from the learned policy. Finally, an algorithm is developed that discovers an appropriate inter-structure mapping and then transfers the previously learned knowledge to the new structure, which speeds up learning in the new environment after self-organization. This paper is one of the first works to analyze the problem fully theoretically and to devise algorithms for finding the best solution. We use a simplified version of the distributed task allocation problem (DTAP) as our case study. The experimental results verify the stability of our approach and show that transfer learning makes finding the optimal solution fast.
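
To make the first ingredient of the abstract concrete, the following is a minimal sketch of standard semi-gradient TD(λ) with a small non-linear value network and accumulating eligibility traces. It is not the paper's algorithm: it covers a single agent only, and the task (a 5-state random walk), the network size, the hyperparameters, and every name in the code (`td_lambda`, `value_and_grad`, `HIDDEN`, and so on) are assumptions chosen for illustration.

```python
import math
import random

N_STATES = 5   # non-terminal states 0..4 of a random-walk chain (assumed toy task)
HIDDEN = 4     # number of tanh hidden units (illustrative choice)

def init_params(rng):
    """Flat parameter list laid out as W1 (HIDDEN x N_STATES), b1, w2, b2."""
    n = HIDDEN * N_STATES + HIDDEN + HIDDEN + 1
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]

def value_and_grad(theta, s):
    """Return v(s) and its gradient w.r.t. theta for a one-hot encoded state s."""
    off_b1 = HIDDEN * N_STATES
    off_w2 = off_b1 + HIDDEN
    off_b2 = off_w2 + HIDDEN
    grad = [0.0] * len(theta)
    v = theta[off_b2]
    grad[off_b2] = 1.0
    for j in range(HIDDEN):
        h = math.tanh(theta[j * N_STATES + s] + theta[off_b1 + j])
        w2 = theta[off_w2 + j]
        v += w2 * h
        dh = w2 * (1.0 - h * h)          # backprop through tanh
        grad[j * N_STATES + s] = dh      # dv/dW1[j][s]
        grad[off_b1 + j] = dh            # dv/db1[j]
        grad[off_w2 + j] = h             # dv/dw2[j]
    return v, grad

def td_lambda(episodes=4000, alpha=0.01, gamma=1.0, lam=0.8, seed=0):
    """Estimate state values of the random walk with semi-gradient TD(lambda)."""
    rng = random.Random(seed)
    theta = init_params(rng)
    for _ in range(episodes):
        z = [0.0] * len(theta)           # eligibility traces, reset per episode
        s = N_STATES // 2                # every episode starts in the middle
        while True:
            v, grad = value_and_grad(theta, s)
            s_next = s + rng.choice((-1, 1))
            done = s_next < 0 or s_next >= N_STATES
            reward = 1.0 if s_next >= N_STATES else 0.0   # reward only on right exit
            v_next = 0.0 if done else value_and_grad(theta, s_next)[0]
            delta = reward + gamma * v_next - v           # TD error
            z = [gamma * lam * zi + gi for zi, gi in zip(z, grad)]
            theta = [t + alpha * delta * zi for t, zi in zip(theta, z)]
            if done:
                break
            s = s_next
    return [value_and_grad(theta, s)[0] for s in range(N_STATES)]
```

Calling `td_lambda()` returns one value estimate per state. For this random walk the true values are 1/6, 2/6, ..., 5/6, so the estimates should increase from left to right and land near those fractions, up to noise from the constant step size.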


Updated: 2021-09-08
