From inverse optimal control to inverse reinforcement learning: A historical review
Annual Reviews in Control (IF 7.3), Pub Date: 2020-06-26, DOI: 10.1016/j.arcontrol.2020.06.001
Nematollah Ab Azar, Aref Shahmansoorian, Mohsen Davoudi

Inverse optimal control (IOC) is a powerful theory that addresses inverse problems in control systems, robotics, machine learning (ML), and optimization while accounting for optimality. This paper reviews the history of the IOC and inverse reinforcement learning (IRL) approaches and describes the connections and differences between them, filling a gap in the existing literature. The general formulation of IOC/IRL is described, and the related methods are categorized hierarchically. For this purpose, IOC methods are divided into two classes: classic and modern approaches. Classic IOC is typically formulated for control systems, while IRL, as a modern approach to IOC, is applied to machine learning problems. Although a handful of IOC/IRL methods exist, a comprehensive categorization of them is lacking. In addition to the IOC/IRL problems, this paper elaborates, where necessary, on other relevant concepts such as Learning from Demonstration (LfD), Imitation Learning (IL), and Behavioral Cloning. The paper also discusses some of the challenges encountered in IOC/IRL problems, including ill-posedness, non-convexity, data availability, non-linearity, the curses of complexity and dimensionality, feature selection, and generalizability.




Updated: 2020-06-26