Data-Driven Economic NMPC Using Reinforcement Learning, IEEE Transactions on Automatic Control


IEEE Transactions on Automatic Control (IF 6.2), Pub Date: 2019-05-24, DOI: 10.1109/tac.2019.2913768
Sebastien Gros, Mario Zanon

Reinforcement learning (RL) is a powerful tool to perform data-driven optimal control without relying on a model of the system. However, RL struggles to provide hard guarantees on the behavior of the resulting control scheme. In contrast, nonlinear model predictive control (NMPC) and economic NMPC (ENMPC) are standard tools for the closed-loop optimal control of complex systems with constraints and limitations, and benefit from a rich theory to assess their closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the quality of the model underlying the control scheme. In this paper, we show that an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system even when using a wrong model. This result also holds for real systems with stochastic dynamics. It follows that ENMPC can be used as a new type of function approximator within RL. Furthermore, we investigate our results in the context of ENMPC and formally connect them to the concept of dissipativity, which is central to ENMPC stability. Finally, we detail how these results can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply these tools to both a classical linear MPC setting and a standard nonlinear example from the ENMPC literature.
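To make the idea of using an MPC scheme as a function approximator within RL more concrete, the following is a minimal sketch in generic notation. The symbols are assumptions chosen for illustration and do not appear in the abstract: a parameter vector \theta, a parameterized stage cost \ell_\theta, terminal cost V^f_\theta, model f_\theta and constraints h_\theta, a measured stage cost L, a discount factor \gamma, and a step size \alpha. The update shown is the standard Q-learning temporal-difference step, used here as one example of a classic RL tool; it is not necessarily the exact formulation used in the paper.

```latex
% Parameterized MPC scheme used as a value-function approximator (sketch).
% All symbols below are illustrative assumptions, not taken from the abstract.
\begin{align}
V_\theta(s) &= \min_{x,\,u}\; \gamma^{N} V^{f}_\theta(x_N)
               + \sum_{k=0}^{N-1} \gamma^{k}\, \ell_\theta(x_k, u_k) \\
\text{s.t.}\quad & x_{k+1} = f_\theta(x_k, u_k), \qquad
                   h_\theta(x_k, u_k) \le 0, \qquad x_0 = s .
\end{align}

% Action-value function: the same problem with the first input fixed to a.
\begin{align}
Q_\theta(s, a) &= \text{value of the problem above with the extra constraint } u_0 = a, \\
\pi_\theta(s)  &= \arg\min_{a}\; Q_\theta(s, a), \qquad
V_\theta(s) = \min_{a}\; Q_\theta(s, a).
\end{align}

% Standard Q-learning step on the MPC parameters, from an observed
% transition (s, a, s^+) of the real system with measured stage cost L(s, a).
\begin{align}
\delta &= L(s, a) + \gamma\, V_\theta(s^{+}) - Q_\theta(s, a), \\
\theta &\leftarrow \theta + \alpha\, \delta\, \nabla_\theta Q_\theta(s, a).
\end{align}
```

In this sketch the model f_\theta only shapes the approximator; the update itself uses transitions and costs observed on the real system, which is consistent with the abstract's claim that a wrong model does not prevent the tuned scheme from delivering the optimal policy.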


Updated: 2024-08-22
