
Risk-sensitive continuous-time Markov decision processes with unbounded rates and Borel spaces


Abstract

This paper considers finite-horizon risk-sensitive optimality for continuous-time Markov decision processes, focusing on the general case in which the transition rates may be unbounded, the cost/reward rates may be unbounded both from below and from above, the policies may be history-dependent, and the state and action spaces are Borel spaces. Under mild conditions on the primitive data of the decision process, we establish the existence of a solution to the corresponding optimality equation (OE) via a so-called approximation technique. Then, using the OE and an extension of Dynkin’s formula developed here, we prove the existence of an optimal Markov policy and verify that the value function is the unique solution to the OE. Finally, we give an example illustrating the difference between our conditions and those in the previous literature.
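
For orientation only, and not quoted from the paper: in the risk-sensitive CTMDP literature the finite-horizon criterion and its optimality equation are usually written in the following form, where the cost rate c, transition rates q(dy | x, a), admissible action sets A(x), horizon T, and controlled state process ξ_s are assumed standard notation rather than this paper’s.

V(t, x) = \inf_{\pi} \, \mathbb{E}^{\pi}_{t,x}\!\left[ \exp\!\left( \int_t^T c(s, \xi_s, a_s)\, ds \right) \right],

\frac{\partial V}{\partial t}(t, x) + \inf_{a \in A(x)} \left\{ c(t, x, a)\, V(t, x) + \int_S V(t, y)\, q(dy \mid x, a) \right\} = 0, \qquad V(T, x) = 1.

The exponential inside the expectation is what makes the criterion risk-sensitive (multiplicative rather than additive in the accumulated cost), and the terminal condition V(T, x) = 1 reflects that no cost is accrued after the horizon; the OE mentioned in the abstract is an equation of this type.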



Acknowledgments

Research supported by the National Natural Science Foundation of China (Grant No. 60171009, Grant No. 61673019, Grant No. 11931018).

Author information


Corresponding author

Correspondence to Junyu Zhang.




Cite this article

Guo, X., Zhang, J. Risk-sensitive continuous-time Markov decision processes with unbounded rates and Borel spaces. Discrete Event Dyn Syst 29, 445–471 (2019). https://doi.org/10.1007/s10626-019-00292-y
