Safe model-based reinforcement learning for nonlinear optimal control with state and input constraints
AIChE Journal ( IF 3.7 ) Pub Date : 2022-01-18 , DOI: 10.1002/aic.17601
Yeonsoo Kim 1 , Jong Woo Kim 2

Safety is a critical factor in reinforcement learning (RL) for chemical processes. In our previous work, we proposed a stability-guaranteed RL method for unconstrained nonlinear control-affine systems: within an approximate policy iteration algorithm, a Lyapunov neural network (LNN) was updated while being constrained to remain a control Lyapunov function, and the policy was updated using a variant of Sontag's formula. In this study, we additionally account for state and input constraints by introducing a barrier function, and we extend the applicable class of systems to general nonlinear systems. We augment the objective function with the constraints and approximate the augmented value function with the LNN combined with a Lyapunov barrier function. The Sontag's-formula input computed from this approximate function drives the states into its lower level sets, thereby guaranteeing constraint satisfaction and stability. We prove practical asymptotic stability and forward invariance, and validate the effectiveness with simulations of a four-tank system.
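For readers unfamiliar with Sontag's formula, which the policy update builds on, the following is a minimal sketch for a control-affine system x_dot = f(x) + g(x)u with a known control Lyapunov function V. The helper names and the toy scalar example are illustrative assumptions, not taken from the paper, which uses a learned LNN in place of the hand-chosen V and a modified formula handling constraints.

```python
import numpy as np

def sontag_input(x, f, g, grad_V):
    """Sontag's universal formula for x_dot = f(x) + g(x) u, given a
    control Lyapunov function V (hypothetical helper, for illustration)."""
    a = float(grad_V(x) @ f(x))      # Lie derivative L_f V
    b = grad_V(x) @ g(x)             # Lie derivative L_g V (vector)
    b2 = float(b @ b)
    if b2 < 1e-12:                   # L_g V == 0: formula prescribes u = 0
        return np.zeros(b.shape)
    return -((a + np.sqrt(a**2 + b2**2)) / b2) * b

# Toy scalar system: x_dot = x + u, with CLF V(x) = x^2 / 2.
f = lambda x: np.array([x[0]])
g = lambda x: np.array([[1.0]])
grad_V = lambda x: np.array([x[0]])  # dV/dx = x

x = np.array([1.0])
for _ in range(200):                 # forward-Euler closed-loop rollout
    u = sontag_input(x, f, g, grad_V)
    x = x + 0.01 * (f(x) + g(x) @ u)
print(abs(x[0]))                     # state decays toward the origin
```

For this example the formula yields u = -(1 + sqrt(2)) x, so the closed loop is x_dot = -sqrt(2) x and the state contracts along the level sets of V, which is the mechanism the paper extends to the learned, barrier-augmented value function.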
