Reinforcement Learning-Based Control for Nonlinear Discrete-Time Systems with Unknown Control Directions and Control Constraints
Neurocomputing (IF 5.5) Pub Date: 2020-08-01, DOI: 10.1016/j.neucom.2020.03.061
Miao Huang, Cong Liu, Xiaoqi He, Longhua Ma, Zheming Lu, Hongye Su

Abstract In this work, output-feedback control problems for a class of discrete-time non-affine nonlinear systems with unknown control directions and input constraints are considered using a reinforcement learning (RL) method. The control is implemented with two neural networks (NNs): 1) a critic NN that estimates a non-quadratic strategic utility function (SUF), and 2) an action NN that generates the control input by minimizing the SUF. Since the control appears in a non-affine form, the implicit function theorem is applied to obtain the optimal control law. For the first time, the discrete Nussbaum gain is introduced into RL-based control to overcome the difficulty of unknown control directions, and a non-quadratic SUF is used to handle the control constraints. Uniform ultimate boundedness of the NN weights and of the closed-loop output tracking error is established theoretically, and two numerical examples are supplied to validate the proposed method.
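The critic/action-NN split described in the abstract can be sketched on a toy scalar plant. Everything below is a hypothetical illustration, not the paper's algorithm: the plant, the learning rates, and the single-weight "networks" are invented for the sketch, and a plain quadratic stage cost stands in for the non-quadratic SUF. The tanh squashing in the action NN is what enforces the input constraint.

```python
import numpy as np

# Toy sketch of a critic NN / action NN pair for a constrained
# discrete-time nonlinear system. All quantities here are assumed
# for illustration; this is NOT the paper's exact algorithm.

u_max = 1.0                    # input constraint |u| <= u_max (assumed)
alpha_a, alpha_c = 0.05, 0.1   # actor / critic learning rates (assumed)

def plant(x, u):
    """Toy non-affine discrete-time dynamics (illustrative only)."""
    return 0.8 * x + np.sin(u) + 0.1 * x * u

w_a, w_c = 0.1, 0.0   # action-NN and critic-NN weights (scalar features)
x = 1.5               # initial state; the regulation target is x = 0
errs = []

for k in range(200):
    # action NN: tanh squashing keeps the control inside [-u_max, u_max]
    u = u_max * np.tanh(w_a * x)
    x_next = plant(x, u)
    cost = x**2 + u**2            # quadratic stage cost (stand-in for SUF)

    # critic NN: one-step temporal-difference update on V(x) ~ w_c * x^2
    td = cost + w_c * x_next**2 - w_c * x**2
    w_c += alpha_c * td * x**2

    # action NN: finite-difference estimate of d(cost-to-go)/du,
    # then a chain-rule step on w_a through the tanh saturation
    du = 1e-4
    u_p = np.clip(u + du, -u_max, u_max)
    grad = ((x**2 + u_p**2 + w_c * plant(x, u_p)**2)
            - (cost + w_c * x_next**2)) / du
    w_a -= alpha_a * grad * u_max * (1 - np.tanh(w_a * x)**2) * x

    x = x_next
    errs.append(abs(x))
```

Note that the paper's setting is harder than this sketch: with an unknown control direction the sign of the actor gradient step is itself unknown, which is precisely what the discrete Nussbaum gain is introduced to handle.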
