Risk-sensitive continuous-time Markov decision processes with unbounded rates and Borel spaces
Discrete Event Dynamic Systems (IF 1.4), Pub Date: 2019-10-19, DOI: 10.1007/s10626-019-00292-y
Xianping Guo, Junyu Zhang

This paper considers finite-horizon risk-sensitive optimality for continuous-time Markov decision processes, focusing on the more general case in which the transition rates are unbounded, the cost/reward rates may be unbounded from below and from above, the policies may be history-dependent, and the state and action spaces are Borel spaces. Under mild conditions imposed on the decision process’s primitive data, we establish the existence of a solution to the corresponding optimality equation (OE) by a so-called approximation technique. Then, using the OE and an extension of Dynkin’s formula developed here, we prove the existence of an optimal Markov policy and verify that the value function is the unique solution to the OE. Finally, we give an example to illustrate the difference between our conditions and those in the previous literature.
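
For orientation, a finite-horizon risk-sensitive (exponential-utility) criterion and its optimality equation commonly take the form sketched below. The notation is assumed for illustration and is not taken from the abstract: state process $\xi_t$, action process $a_t$, cost rate $c$, terminal cost $g$, transition rate kernel $q$, horizon $T$, state space $S$, and admissible action sets $A(x)$. The paper's exact statement may differ, for instance in how randomized or history-dependent policies enter the integrand, or in the function class in which the solution is sought.

```latex
% Risk-sensitive value function (assumed notation, minimization of cost)
V(t,x) \;=\; \inf_{\pi}\,
  \mathbb{E}^{\pi}\!\left[
    \exp\!\left( \int_t^T c(s,\xi_s,a_s)\,ds \;+\; g(\xi_T) \right)
    \,\middle|\, \xi_t = x
  \right]

% Corresponding optimality equation (OE) with its terminal condition
\frac{\partial \varphi}{\partial t}(t,x)
  \;+\; \inf_{a \in A(x)} \left\{
      c(t,x,a)\,\varphi(t,x)
      \;+\; \int_S \varphi(t,y)\, q(dy \mid t,x,a)
  \right\} \;=\; 0,
\qquad
\varphi(T,x) \;=\; e^{g(x)}
```

Because the exponential sits inside the expectation, the cost rate multiplies $\varphi$ in the OE rather than entering additively as it does in the risk-neutral case. This is one reason unbounded cost and transition rates need separate growth conditions.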

Updated: 2019-10-19