Heuristic mean-variance optimization in Markov decision processes using state-dependent risk aversion
IMA Journal of Management Mathematics (IF 1.7), Pub Date: 2021-03-09, DOI: 10.1093/imaman/dpab009
Rainer Schlosser

In dynamic decision problems, it is challenging to find the right balance between maximizing expected rewards and minimizing risks. In this paper, we consider NP-hard mean-variance (MV) optimization problems in Markov decision processes with a finite time horizon. We present a heuristic approach to solving MV problems that is based on state-dependent risk aversion and efficient dynamic programming techniques. Our approach can also be applied to mean-semivariance (MSV) problems, which focus specifically on downside risk. We demonstrate the applicability and effectiveness of our heuristic for dynamic pricing applications. Using reproducible examples, we show that our approach outperforms existing state-of-the-art benchmark models for MV and MSV problems while also providing competitive runtimes. Further, compared to models based on constant risk levels, we find that state-dependent risk aversion allows more effective intervention when sales processes deviate from their planned paths. Our concepts are domain-independent, easy to implement and of low computational complexity.
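
To make the two quantities in an MV objective concrete, the following is a minimal sketch, not the paper's heuristic. It evaluates the mean and variance of the total reward for one fixed policy in a finite-horizon Markov chain by a backward recursion on the first and second moments of the reward-to-go. The names `policy_moments` and `mv_score`, the toy inputs `P` and `r`, and the constant trade-off weight `beta` are all assumptions made for illustration; the paper's risk aversion is state-dependent, which this sketch does not model.

```python
def policy_moments(P, r, horizon):
    """Mean and variance of total reward per start state under a fixed policy.

    P[s][t]: probability of moving from state s to state t (hypothetical input).
    r[s][t]: reward collected on the transition s -> t (hypothetical input).
    """
    n = len(P)
    m1 = [0.0] * n  # E[G | s], where G is the reward-to-go
    m2 = [0.0] * n  # E[G^2 | s]
    for _ in range(horizon):
        # G = r + G_next; expand (r + G_next)^2 and take expectations over t.
        new_m1 = [sum(P[s][t] * (r[s][t] + m1[t]) for t in range(n))
                  for s in range(n)]
        new_m2 = [sum(P[s][t] * (r[s][t] ** 2 + 2 * r[s][t] * m1[t] + m2[t])
                      for t in range(n))
                  for s in range(n)]
        m1, m2 = new_m1, new_m2
    var = [m2[s] - m1[s] ** 2 for s in range(n)]
    return m1, var


def mv_score(mean, var, beta):
    """Mean-variance trade-off with a constant weight beta (an assumption)."""
    return mean - beta * var
```

Calling `policy_moments` for each candidate policy and ranking the results with `mv_score` would be a brute-force baseline that only works for very small problems.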

Updated: 2021-03-09