Inverse Reinforcement Learning in Tracking Control Based on Inverse Optimal Control.
IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 2021-04-20, DOI: 10.1109/tcyb.2021.3062856
Wenqian Xue, Patrik Kolaric, Jialu Fan, Bosen Lian, Tianyou Chai, Frank L. Lewis

This article provides a novel inverse reinforcement learning (RL) algorithm that learns an unknown performance objective function for tracking control. The algorithm combines three steps: 1) an optimal control update; 2) a gradient descent correction step; and 3) an inverse optimal control (IOC) update. The new algorithm clarifies the relation between inverse RL and IOC. It is shown that the reward weight of an unknown performance objective that generates a target control policy may not be unique. We characterize the set of all weights that generate the same target control policy. We develop a model-based algorithm and, further, two model-free algorithms for systems with unknown model information. Finally, simulation experiments are presented to show the effectiveness of the proposed algorithms.
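The non-uniqueness claim can be made concrete in the discrete-time LQR special case (a simpler regulation setting than the paper's tracking problem, used here only as an illustration). For dynamics $x_{k+1}=Ax_k+Bu_k$ and cost $\sum_k \left(x_k^{\top}Qx_k+u_k^{\top}Ru_k\right)$, the optimal policy $u_k=-Kx_k$ satisfies

\[
K=(R+B^{\top}PB)^{-1}B^{\top}PA, \qquad P=Q+A^{\top}P\,(A-BK).
\]

For a fixed target gain $K^{*}$ and known $R\succ 0$, every symmetric $P$ obeying the stationarity condition $B^{\top}P\,(A-BK^{*})=RK^{*}$ yields a weight $Q=P-A^{\top}P\,(A-BK^{*})$ under which $K^{*}$ is optimal. That condition imposes only $mn$ linear equations on the $n(n+1)/2$ free entries of $P$, so whenever $mn<n(n+1)/2$ an affine family of reward weights generates the same target policy.

The three-step structure can likewise be sketched in this LQR setting. The snippet below is a minimal model-based illustration, not the authors' algorithm: the 2-state system, the step size, and the exact form of each update are assumptions, and the paper's tracking terms and model-free variants are omitted.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical 2-state, 1-input system (not from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
R = np.eye(1)  # input weight assumed known and fixed, as is common in IOC

def optimal_gain(Q):
    """Step 1 (optimal control update): gain induced by the current weight Q."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def ioc_weight(K):
    """Step 3 (IOC update): recover a state weight for which K is optimal.
    Solve the stationarity condition B'P(A - BK) = RK for a symmetric P by
    least squares, then read Q off the Riccati identity Q = P - A'P(A - BK)."""
    n, m = B.shape
    Ac = A - B @ K
    pairs = [(i, j) for i in range(n) for j in range(i, n)]  # upper triangle of P
    rows = []
    for r in range(m):
        for c in range(n):
            # Coefficient of P[i, j] in the (r, c) entry of B' P Ac.
            row = [B[i, r] * Ac[j, c] + (B[j, r] * Ac[i, c] if i != j else 0.0)
                   for (i, j) in pairs]
            rows.append(row)
    sol, *_ = np.linalg.lstsq(np.array(rows), (R @ K).ravel(), rcond=None)
    P = np.zeros((n, n))
    for (i, j), v in zip(pairs, sol):
        P[i, j] = P[j, i] = v
    Qn = P - A.T @ P @ Ac
    return 0.5 * (Qn + Qn.T)  # symmetrize against round-off

# Target policy generated by a weight hidden from the learner.
K_target = optimal_gain(np.diag([4.0, 1.0]))

Q = np.eye(2)                        # learner's initial weight estimate
for _ in range(50):
    K = optimal_gain(Q)              # step 1: optimal control update
    K = K - 0.5 * (K - K_target)     # step 2: gradient step on ||K - K*||^2
    Q = ioc_weight(K)                # step 3: inverse optimal control update

print("recovered weight:\n", Q)
print("policy gap:", np.linalg.norm(optimal_gain(Q) - K_target))
```

Because the stationarity system is underdetermined (here two equations in three unknowns), the least-squares step picks one member of a one-parameter family of (P, Q) pairs; other choices of the free entry give different recovered weights that induce the exact same gain, which is the non-uniqueness discussed above.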
