Stochastic Dynamic Programming with Non-linear Discounting, Applied Mathematics and Optimization


Stochastic Dynamic Programming with Non-linear Discounting
Applied Mathematics and Optimization (IF 1.8), Pub Date: 2020-12-23, DOI: 10.1007/s00245-020-09731-x
Nicole Bäuerle, Anna Jaśkiewicz, Andrzej S. Nowak

In this paper, we study a Markov decision process with a non-linear discount function and a Borel state space. We define a recursive discounted utility, which resembles the non-additive utility functions considered in a number of models in economics. Non-additivity here follows from the non-linearity of the discount function. Our study is complementary to the work of Jaśkiewicz et al. (Math Oper Res 38:108–121, 2013), where non-linear discounting is also used in the stochastic setting, but the expectation of utilities aggregated over the space of all histories of the process is applied, leading to a non-stationary dynamic programming model. Our aim is to prove that, in the recursive discounted utility case, the Bellman equation has a solution and there exists an optimal stationary policy for the infinite-horizon problem. Our approach covers two cases: (a) when the one-stage utility is bounded on both sides by a weight function multiplied by some positive and negative constants, and (b) when the one-stage utility is unbounded from below.
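The abstract does not write out the Bellman equation, so the following is only an illustrative sketch under stated assumptions, not the authors' construction. It assumes a finite state and action space (the paper works on a Borel state space) and an operator of the form (Tv)(x) = max_a [u(x, a) + sum_y q(y | x, a) * delta(v(y))], which is one way to express a recursive utility with a non-linear discount function delta. The toy states, actions, utilities, transition probabilities, and the particular delta are all hypothetical. The chosen delta is increasing, satisfies delta(0) = 0, and has slope at most 0.9, so repeated application of T converges in the sup norm.

```python
import math


def bellman_rhs(x, a, v, u, q, delta):
    """One-stage utility plus the expected non-linearly discounted continuation value."""
    return u[x][a] + sum(p * delta(v[y]) for y, p in q[x][a].items())


def value_iteration(states, actions, u, q, delta, tol=1e-10, max_iter=10_000):
    """Iterate v <- T v until the sup-norm change falls below tol.

    Returns the approximate fixed point and a stationary policy that is
    greedy with respect to the last iterate.
    """
    v = {x: 0.0 for x in states}
    policy = {}
    for _ in range(max_iter):
        new_v = {}
        for x in states:
            best_a, best_val = None, -math.inf
            for a in actions:
                val = bellman_rhs(x, a, v, u, q, delta)
                if val > best_val:
                    best_a, best_val = a, val
            new_v[x] = best_val
            policy[x] = best_a
        gap = max(abs(new_v[x] - v[x]) for x in states)
        v = new_v
        if gap < tol:
            break
    return v, policy


# Hypothetical non-linear discount function (an assumption, not from the paper).
def delta(z):
    return 0.9 * z / (1.0 + abs(z))


# Hypothetical two-state, two-action model.
states = ["low", "high"]
actions = ["wait", "invest"]
u = {
    "low": {"wait": 0.0, "invest": -1.0},
    "high": {"wait": 2.0, "invest": 1.0},
}
q = {
    "low": {"wait": {"low": 1.0}, "invest": {"low": 0.5, "high": 0.5}},
    "high": {"wait": {"low": 0.5, "high": 0.5}, "invest": {"high": 1.0}},
}

v, policy = value_iteration(states, actions, u, q, delta)
print(v, policy)
```

With a linear choice such as delta(z) = beta * z, the same routine reduces to standard discounted value iteration, which gives a simple sanity check for the sketch.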


Updated: 2020-12-23
