Learning Data-Driven Decision-Making Policies in Multi-Agent Environments for Autonomous Systems
Cognitive Systems Research (IF 2.1), Pub Date: 2021-01-01, DOI: 10.1016/j.cogsys.2020.09.006
Joosep Hook, Seif El-Sedky, Varuna De Silva, Ahmet Kondoz

Abstract: Autonomous systems such as Connected Autonomous Vehicles (CAVs) and assistive robots are set to improve the way we live. To do so, they need to be equipped with autonomous decision-making capabilities. Reinforcement Learning (RL), a type of machine learning in which an agent learns by interacting with its environment through trial and error, has gained significant interest from the research community for its promise to learn decision making efficiently through the abstraction of experiences. However, most control algorithms used in current autonomous systems, such as driverless vehicle prototypes and mobile robots, rely on supervised learning methods or manually designed rule-based policies. Additionally, many emerging autonomous systems, such as driverless cars, operate in multi-agent environments, often with partial observability. Learning decision-making policies in multi-agent environments is a challenging problem because the environment is not stationary from the perspective of a learning agent, so the Markov properties assumed in single-agent RL do not hold. This paper focuses on learning decision-making policies in multi-agent environments, both in cooperative settings with full observability and in dynamic environments with partial observability. We present experiments in simple yet effective new multi-agent environments that simulate policy learning in scenarios an autonomous navigating agent such as a CAV could encounter. The results illustrate how agents learn to cooperate in order to achieve their objectives. We also show that, in a partially observable setting, an agent can learn to roam its environment without colliding with obstacles or other moving agents. Finally, the paper discusses how data-driven multi-agent policy learning can be extended to real-world environments by augmenting the intelligence of autonomous vehicles.
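To make the trial-and-error learning loop the abstract refers to concrete, the following is a minimal tabular Q-learning sketch on a toy one-dimensional corridor. The environment, reward scheme, and hyperparameters here are illustrative assumptions, not the paper's actual multi-agent setup; the paper's environments are more complex and partially observable.

```python
import random

# Toy corridor of 5 cells (0..4); reaching cell 4 ends an episode.
# This is an assumed illustrative environment, not the paper's benchmark.
N_STATES = 5
ACTIONS = [-1, +1]  # step left or step right

def step(state, action):
    """Deterministic transition; reward 1.0 only on reaching the goal cell."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    done = nxt == N_STATES - 1
    return nxt, reward, done

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning: learn action values by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection: occasional exploration
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a_: q[(s, a_)])
            s2, r, done = step(s, a)
            # temporal-difference update toward the bootstrapped target
            target = r + (0.0 if done else gamma * max(q[(s2, a_)] for a_ in ACTIONS))
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

if __name__ == "__main__":
    q = train()
    # after training, the greedy policy steps right in every non-goal cell
    policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
    print(policy)
```

In a multi-agent setting, other learning agents make the transition dynamics non-stationary from any one agent's perspective, which is precisely why this single-agent formulation breaks down, as the abstract notes.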

Updated: 2021-01-01