Abstract
This paper considers finite-horizon risk-sensitive optimality for continuous-time Markov decision processes, focusing on the general case in which the transition rates are unbounded, the cost/reward rates may be unbounded both from below and from above, the policies may be history-dependent, and the state and action spaces are Borel spaces. Under mild conditions on the process's primitive data, we establish the existence of a solution to the corresponding optimality equation (OE) via a so-called approximation technique. Then, using the OE and an extension of Dynkin's formula developed here, we prove the existence of an optimal Markov policy and verify that the value function is the unique solution to the OE. Finally, we give an example illustrating the difference between our conditions and those in the previous literature.
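To make the optimality equation concrete, the following is a minimal numerical sketch on a hypothetical finite toy model (not the paper's general Borel setting, and all data below are invented for illustration). With exponential-utility cost, the value v(t,x) = inf E[exp(∫ c ds)] satisfies the backward equation −∂v/∂t(t,x) = min_a [ c(x,a) v(t,x) + Σ_y q(y|x,a) v(t,y) ] with terminal condition v(T,x) = 1, which we discretize by backward Euler:

```python
import numpy as np

# Hypothetical toy model: 2 states, 2 actions.
# q[a] is the transition-rate matrix under action a (each row sums to 0);
# c[x, a] is the cost rate in state x under action a.
q = np.array([[[-1.0, 1.0], [2.0, -2.0]],    # rates under action 0
              [[-3.0, 3.0], [0.5, -0.5]]])   # rates under action 1
c = np.array([[0.2, 0.5],    # state 0: cost rates under actions 0, 1
              [1.0, 0.3]])   # state 1: cost rates under actions 0, 1

T, n_steps = 1.0, 1000
h = T / n_steps
v = np.ones(2)               # terminal condition v(T, x) = 1

# Backward Euler on the risk-sensitive optimality equation:
#   -dv/dt(t, x) = min_a [ c(x, a) v(t, x) + sum_y q(y | x, a) v(t, y) ]
for _ in range(n_steps):
    rhs = np.stack([c[:, a] * v + q[a] @ v for a in range(2)], axis=1)
    v = v + h * rhs.min(axis=1)   # step from t back to t - h

print("v(0, .) =", v)                      # inf E[exp(integral of c)]
print("certainty equivalent:", np.log(v))  # risk-sensitive cost
```

Since the cost rates here lie in [0.2, 1.0], the computed values satisfy 1 < v(0,x) < e over the unit horizon; the minimizing action at each grid point yields a Markov policy, in line with the existence result stated above.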
Acknowledgments
Research supported by the National Natural Science Foundation of China (Grant No. 60171009, Grant No. 61673019, Grant No. 11931018).
Cite this article
Guo, X., Zhang, J. Risk-sensitive continuous-time Markov decision processes with unbounded rates and Borel spaces. Discrete Event Dyn Syst 29, 445–471 (2019). https://doi.org/10.1007/s10626-019-00292-y