Dynamic Games of Asymmetric Information for Deceptive Autonomous Vehicles
arXiv - CS - Computer Science and Game Theory. Pub Date: 2019-06-30, DOI: arxiv-1907.00459
Linan Huang, Quanyan Zhu

This paper studies rational and persistent deception among intelligent robots to enhance the security and operational efficiency of autonomous vehicles. We present an N-person K-stage nonzero-sum game with an asymmetric information structure in which each robot's private information is modeled as a random variable, referred to as its type. The deception is persistent because each robot's private type remains unknown to the other robots at all stages. The deception is rational because robots aim to achieve their deception goals at minimum cost. Each robot forms a belief about the others' types based on state observations and updates it via Bayes' rule. The level-t perfect Bayesian Nash equilibrium is a natural solution concept for the dynamic game. It captures the sequential rationality of the agents, keeps beliefs consistent with the observations and strategies, and provides a reliable prediction of the outcome of the deception game. In particular, in the linear-quadratic setting, we derive a set of extended Riccati equations, obtain the explicit form of the affine state-feedback control, and develop an online computational algorithm. We define the concepts of deceivability and the price of deception to evaluate the strategy design and assess the deception outcome. We investigate a case study of deceptive pursuit-evasion games and use numerical experiments to corroborate the results.
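
The belief update mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a finite set of two hypothetical types (`type_a`, `type_b`), a scalar state observation, and a Gaussian observation model with unit standard deviation, all chosen here purely for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_update(belief, likelihoods):
    """One step of Bayes' rule over a finite type set.

    belief:      dict mapping type -> prior probability
    likelihoods: dict mapping type -> P(observation | type)
    """
    unnormalized = {t: belief[t] * likelihoods[t] for t in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # Observation has zero likelihood under every type: keep the prior.
        return dict(belief)
    return {t: p / total for t, p in unnormalized.items()}


# Assumption: each hypothetical type predicts a different next state.
predicted_state = {"type_a": 1.0, "type_b": -1.0}

# Uniform prior over the other robot's type.
belief = {"type_a": 0.5, "type_b": 0.5}

# Assumed sequence of noisy state observations, one per stage.
for observed in [0.8, 1.1, 0.9]:
    likelihoods = {
        t: gaussian_pdf(observed, predicted_state[t], 1.0) for t in belief
    }
    belief = bayes_update(belief, likelihoods)

print(belief)
```

With observations clustered near the state predicted by `type_a`, the belief shifts toward that type stage by stage. In the game described above, the likelihood would instead come from the state dynamics and the other robots' equilibrium strategies.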

Updated: 2020-03-24