Life is Random, Time is Not: Markov Decision Processes with Window Objectives
arXiv - CS - Logic in Computer Science. Pub Date: 2019-01-11, DOI: arxiv-1901.03571
Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, and Mickael Randour

The window mechanism was introduced by Chatterjee et al. to strengthen classical game objectives with time bounds. It makes it possible to synthesize system controllers that exhibit acceptable behaviors within a configurable time frame throughout their infinite execution, in contrast to traditional objectives, which only require correct behavior in the limit. The window concept has proved useful in a variety of two-player zero-sum games, both because it enables reasoning about such time bounds in system specifications and because of the increased tractability that it usually yields. In this work, we extend the window framework to stochastic environments by considering Markov decision processes. A fundamental problem in this context is the threshold probability problem: given an objective, the goal is to synthesize strategies that guarantee satisfying runs with a given probability. We solve it for the usual variants of window objectives, where either the time frame is set as a parameter or we ask whether such a time frame exists. We develop a generic approach for window-based objectives and instantiate it for the classical mean-payoff and parity objectives, already considered in games. Our work paves the way to a wide use of the window mechanism in stochastic models.
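To make the window idea concrete, here is a minimal Python sketch of a fixed-window mean-payoff check on a finite prefix of a run. It assumes the definition commonly used in the games literature (a window opened at some position closes once the sum of the weights seen since then is non-negative, and it must do so within a bounded number of steps); the abstract itself does not spell this definition out. The function names `window_closes` and `all_windows_close` are hypothetical, and the sketch deliberately ignores the probabilistic side of the threshold probability problem.

```python
from typing import Sequence


def window_closes(weights: Sequence[int], start: int, max_len: int) -> bool:
    """Return True if the window opened at `start` closes within `max_len` steps.

    Assumed definition: the window closes after l steps (1 <= l <= max_len)
    when the sum of the l weights starting at `start` is non-negative,
    i.e. their mean is at least 0.
    """
    total = 0
    for step in range(start, min(start + max_len, len(weights))):
        total += weights[step]
        if total >= 0:
            return True
    return False


def all_windows_close(weights: Sequence[int], max_len: int) -> bool:
    """Check every position whose full window fits inside the finite prefix.

    Positions too close to the end are skipped, because a window that has
    not closed yet could still close beyond the observed prefix.
    """
    last_start = len(weights) - max_len
    return all(window_closes(weights, i, max_len) for i in range(last_start + 1))
```

For example, on the weight sequence `[-1, 2, -1, -1, 3]` a bound of 2 fails, because the window opened at index 2 sees `-1, -1`, while a bound of 3 succeeds on every position that can be checked. This illustrates the two variants mentioned in the abstract: the bound can be fixed as a parameter, or one can ask whether some bound works.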

Updated: 2020-04-03