Bounded rational Dubins vehicle coordination for target tracking using reinforcement learning
Automatica (IF 4.8), Pub Date: 2023-01-03, DOI: 10.1016/j.automatica.2022.110732
Nick-Marios T. Kokolakis, Kyriakos G. Vamvoudakis

In this paper, we address the problem of cooperative tracking of multiple heterogeneous targets by deploying multiple heterogeneous pursuers with different decision-making capabilities. Initially, under infinite resources, we formulate a game between an evader and the pursuing team, with the evader being the maximizing player and the pursuing team the minimizing one. Subsequently, we relax the perfect-rationality assumption via a level-k thinking framework that allows the evaders to exhibit different levels of rationality. The policies at each level of rationality are computed using a reinforcement learning-based architecture and are proven to form Nash policies as the thinking levels increase. Finally, in the case of multiple pursuers against multiple targets, we develop a switched learning scheme with multiple convergence sets by assigning the most intelligent pursuers to the most intelligent evaders.
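
The assignment rule in the last sentence of the abstract (most intelligent pursuers matched to most intelligent evaders) can be illustrated with a minimal sketch. It assumes each agent carries a single integer thinking level, which is a simplification made here; the `Agent` class, the `assign_by_level` function, and the sample levels are hypothetical and are not taken from the paper. The switched learning scheme itself is not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    level: int  # hypothetical level-k index; higher means deeper reasoning


def assign_by_level(pursuers, evaders):
    """Pair pursuers with evaders in descending order of thinking level."""
    ranked_pursuers = sorted(pursuers, key=lambda a: a.level, reverse=True)
    ranked_evaders = sorted(evaders, key=lambda a: a.level, reverse=True)
    # zip stops at the shorter list, so surplus agents remain unassigned.
    return {p.name: e.name for p, e in zip(ranked_pursuers, ranked_evaders)}


# Illustrative data only (assumed, not from the paper).
pursuers = [Agent("P1", 1), Agent("P2", 3), Agent("P3", 2)]
evaders = [Agent("E1", 2), Agent("E2", 0), Agent("E3", 4)]
assignment = assign_by_level(pursuers, evaders)
print(assignment)
```

Sorting both teams and pairing them rank by rank is only one plausible reading of the rule; how ties or unequal team sizes are handled is left open here.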




Updated: 2023-01-03