Inverse Risk-Sensitive Reinforcement Learning
IEEE Transactions on Automatic Control (IF 6.8), Pub Date: 2020-03-01, DOI: 10.1109/tac.2019.2926674
Lillian J. Ratliff, Eric Mazumdar

This work addresses the problem of inverse reinforcement learning in Markov decision processes where the decision-making agent is risk-sensitive. In particular, it presents a risk-sensitive reinforcement learning algorithm with convergence guarantees that makes use of coherent risk metrics and of models of human decision-making originating in behavioral psychology and economics. This risk-sensitive reinforcement learning algorithm provides the theoretical underpinning for a gradient-based inverse reinforcement learning algorithm that seeks to minimize a loss function defined on the observed behavior. It is shown that the gradient of the loss function with respect to the model parameters is well defined and computable via a contraction-map argument. The proposed technique is evaluated on a Grid World example, a canonical benchmark problem.
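The abstract describes a two-level pipeline: an inner risk-sensitive planner whose Bellman-style operator is a contraction, and an outer gradient step on a loss defined over observed behavior. The sketch below (Python/NumPy) illustrates that shape on a toy grid world under loudly stated assumptions: a hypothetical one-parameter prospect-theory-style utility stands in for the paper's behavioral decision models, a softmax policy stands in for its behavior model, and a central finite difference stands in for the exact gradient the paper derives via its contraction-map argument. Nothing here reproduces the authors' algorithm or experiments.

```python
import numpy as np

# Tiny deterministic grid world: 5 states in a line, actions 0 = left, 1 = right.
# An illustrative stand-in for the paper's Grid World benchmark, not its exact setup.
N_STATES, N_ACTIONS, GAMMA, BETA = 5, 2, 0.9, 1.0
REWARD = np.array([-4.0, 0.0, -0.5, 0.0, 2.0])  # reward collected on entering a state

def step(s, a):
    """Move left or right, clipped at the grid edges."""
    return min(N_STATES - 1, max(0, s + (1 if a == 1 else -1)))

def utility(r, theta):
    """Hypothetical one-parameter prospect-theory-style value function:
    gains and losses are both warped by the exponent theta in (0, 1]."""
    return r ** theta if r >= 0 else -((-r) ** theta)

def risk_sensitive_q(theta, n_sweeps=200):
    """Value iteration on utility-warped rewards. The warped Bellman operator
    is still a gamma-contraction, so this fixed-point iteration converges."""
    Q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(n_sweeps):
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                s2 = step(s, a)
                Q[s, a] = utility(REWARD[s2], theta) + GAMMA * Q[s2].max()
    return Q

def mean_log_likelihood(theta, demos):
    """Mean log-likelihood of observed (state, action) pairs under a softmax
    (Boltzmann) policy built from the risk-sensitive Q-values."""
    Q = risk_sensitive_q(theta)
    log_z = np.log(np.exp(BETA * Q).sum(axis=1))  # per-state normalizer
    return np.mean([BETA * Q[s, a] - log_z[s] for s, a in demos])

def inverse_rl(demos, theta=0.9, lr=0.1, eps=1e-4, n_steps=200):
    """Gradient-based inverse RL over the risk parameter: ascend the demo
    log-likelihood, i.e., descend a loss on the observed behavior. The paper
    computes the exact gradient via its contraction-map argument; this sketch
    substitutes a central finite difference purely for illustration."""
    for _ in range(n_steps):
        g = (mean_log_likelihood(theta + eps, demos)
             - mean_log_likelihood(theta - eps, demos)) / (2 * eps)
        theta = float(np.clip(theta + lr * g, 0.05, 1.0))
    return theta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_theta = 0.4
    Q = risk_sensitive_q(true_theta)
    policy = np.exp(BETA * Q)
    policy /= policy.sum(axis=1, keepdims=True)
    # Sample demonstrations from the "true" risk-sensitive agent's softmax policy.
    demos = [(s, rng.choice(N_ACTIONS, p=policy[s]))
             for s in rng.integers(0, N_STATES, 500)]
    print("true theta: %.2f, recovered theta: %.3f" % (true_theta, inverse_rl(demos)))
```

The design point the sketch leans on matches the abstract's key claim: because the warped backup remains a gamma-contraction, the inner iteration converges for every candidate risk parameter, so the outer loss is well defined at every theta and can be searched by gradient steps.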

Updated: 2020-03-01