Robust Finite-State Controllers for Uncertain POMDPs
arXiv - CS - Systems and Control, Pub Date: 2020-09-24, DOI: arxiv-2009.11459
Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi, Marnix Suilen, Ufuk Topcu

Uncertain partially observable Markov decision processes (uPOMDPs) allow the probabilistic transition and observation functions of standard POMDPs to belong to a so-called uncertainty set. Such uncertainty sets capture uncountable sets of probability distributions. We develop an algorithm to compute finite-memory policies for uPOMDPs that robustly satisfy given specifications against any admissible distribution. In general, computing such policies is both theoretically and practically intractable. We provide an efficient solution to this problem in four steps. (1) We state the underlying problem as a nonconvex optimization problem with infinitely many constraints. (2) A dedicated dualization scheme yields a dual problem that is still nonconvex but has finitely many constraints. (3) We linearize this dual problem and (4) solve the resulting finite linear program to obtain locally optimal solutions to the original problem. The resulting problem formulation is exponentially smaller than those resulting from existing methods. We demonstrate the applicability of our algorithm using large instances of an aircraft collision-avoidance scenario and a novel spacecraft motion planning case study.
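The requirement to hold "against any admissible distribution" can be made concrete in the simplest setting. The following is a minimal sketch, assuming the uncertainty set for a single state-action pair is given by probability intervals; the abstract does not say which form the sets take, and the function name `worst_case_expectation` and all numbers are hypothetical. It uses a standard greedy argument for interval sets, not the dualization scheme of the paper, and only shows why a worst case over uncountably many distributions can still be computed with finite work.

```python
def worst_case_expectation(lower, upper, values):
    """Minimise sum(p[i] * values[i]) subject to
    lower[i] <= p[i] <= upper[i] and sum(p) == 1.

    Interval uncertainty sets are an assumption made for this sketch.
    Returns the minimal expectation and a distribution attaining it.
    """
    if sum(lower) > 1.0 + 1e-12 or sum(upper) < 1.0 - 1e-12:
        raise ValueError("the intervals admit no probability distribution")

    # Start from the lower bounds; 'budget' is the mass still to be placed.
    p = list(lower)
    budget = max(0.0, 1.0 - sum(lower))

    # Greedy step: give the remaining mass to the successors with the
    # smallest value first, up to each upper bound.
    for i in sorted(range(len(values)), key=lambda k: values[k]):
        add = max(0.0, min(upper[i] - lower[i], budget))
        p[i] += add
        budget -= add

    return sum(pi * vi for pi, vi in zip(p, values)), p


if __name__ == "__main__":
    # Hypothetical example: three successor states with values 10, 0 and 5.
    value, dist = worst_case_expectation(
        lower=[0.1, 0.2, 0.3],
        upper=[0.5, 0.6, 0.7],
        values=[10.0, 0.0, 5.0],
    )
    print(value, dist)
```

In this example all free probability mass goes to the zero-valued successor, so the worst-case expectation is 2.5. Any bound that holds for this single distribution holds for every distribution in the interval set.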

Updated: 2020-09-25