A nucleus for Bayesian Partially Observable Markov Games: Joint observer and mechanism design
Engineering Applications of Artificial Intelligence (IF 8), Pub Date: 2020-08-10, DOI: 10.1016/j.engappai.2020.103876
Julio B. Clempner, Alexander S. Poznyak

An intelligent agent is an autonomous entity that manages and learns the actions to take toward achieving its goals. A problem widely acknowledged in the Artificial Intelligence (AI) literature is that it is challenging to develop an approach that computes efficient decisions maximizing the total reward of interacting agents in an environment with unknown, incomplete, and uncertain information.

To address this challenge, this paper presents a nucleus for Bayesian Partially Observable Markov Games (BPOMGs) supported by an AI approach. Three fundamental topics make up the structure of the nucleus: game theory, learning, and inference. First, we present a novel general Bayesian approach for games that accounts for both the incomplete information of the Bayesian model and the incomplete information about the states of the Markov system. In this new model, execution uncertainty is handled by a Partially Observable Markov Game (POMG). Second, we extend the design theory, which now involves both the mechanism design and the joint observer design (both unknown). The mechanism design arises because agents act in their own self-interest; its purpose is to induce agents to not reveal their private information and to produce a particular outcome. The joint observer design represents the fact that agents may not be interested in providing accurate information about their states. In addition, agents follow a model that employs a Reinforcement Learning (RL) approach to estimate the transition matrices (also unknown) at each time step. Finally, we extend the POMG model by introducing a new variable and propose an analytical solution for computing both the observer design and the mechanism design (the two unknowns). The proposed extension makes the game-theoretic problem computationally tractable. We derive relations to recover analytically the variables of interest for each agent, i.e., observation kernels, joint observers, mechanism, strategies, and distribution vectors. The usefulness and effectiveness of the proposed nucleus are validated in simulation through a game-theoretic analysis of the patrolling problem, which involves designing the mechanism, computing the observers, and employing an RL approach.
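
The abstract does not give the estimator or the observer equations, so the following is only a minimal sketch of two generic ingredients it names: estimating an unknown transition matrix from observed transitions, and updating a belief over hidden states through an observation kernel. It assumes a single agent, a fixed action, simple frequency counts with additive smoothing, and a standard Bayes filter. The function names, the smoothing prior, and the toy numbers are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: not the authors' method.
# Assumptions: one agent, one fixed action, two hidden states.

def estimate_transitions(counts, prior=1.0):
    """Estimate a row-stochastic transition matrix from transition counts.

    counts[s][t] is the number of observed moves from state s to state t.
    `prior` is an additive-smoothing pseudo-count (an assumption made here
    so that unvisited transitions keep a nonzero probability).
    """
    matrix = []
    for row in counts:
        total = sum(row) + prior * len(row)
        matrix.append([(c + prior) / total for c in row])
    return matrix


def belief_update(belief, transitions, obs_kernel, obs):
    """One step of a standard Bayes filter over hidden states.

    transitions[s][t] : probability of moving from state s to state t
    obs_kernel[t][o]  : probability of observing o when the state is t
    """
    n = len(belief)
    # Prediction step: push the belief through the transition matrix.
    predicted = [sum(belief[s] * transitions[s][t] for s in range(n))
                 for t in range(n)]
    # Correction step: weight by the likelihood of the observation.
    weighted = [predicted[t] * obs_kernel[t][obs] for t in range(n)]
    norm = sum(weighted)
    return [w / norm for w in weighted]


# Toy data (hypothetical): counts of observed state-to-state moves.
counts = [[2, 1],
          [0, 2]]
P_hat = estimate_transitions(counts)

# Hypothetical observation kernel: rows are states, columns are observations.
Q = [[0.9, 0.1],
     [0.2, 0.8]]

# Start from a uniform belief and incorporate observation 1.
belief = belief_update([0.5, 0.5], P_hat, Q, obs=1)
print(P_hat)
print(belief)
```

In a multi-agent setting of the kind the abstract describes, each agent would presumably keep its own counts and observation kernel, and the counts would be refreshed at each time step as new transitions are observed. That extension is not shown here.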




Updated: 2020-08-10