Reinforcement Learning for Mixed-Integer Problems Based on MPC
arXiv - CS - Systems and Control Pub Date : 2020-04-03 , DOI: arxiv-2004.01430 Sebastien Gros, Mario Zanon
Model Predictive Control (MPC) has recently been proposed as a policy approximation
for Reinforcement Learning, offering a path towards safe and explainable
Reinforcement Learning. This approach has been investigated for Q-learning and
actor-critic methods, both in the context of nominal Economic MPC and Robust
(N)MPC, with very promising results. In that context, actor-critic methods
appear to be the most reliable approach. Many applications involve a mixture of
continuous and integer inputs, for which the classical actor-critic methods
need to be adapted. In this paper, we present a policy approximation based on
mixed-integer MPC schemes and propose a computationally inexpensive technique
to generate exploration in the mixed-integer input space that ensures
satisfaction of the constraints. We then propose a simple compatible
advantage-function approximation for the proposed policy, which allows one to
build the gradient of the mixed-integer MPC-based policy.
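To make the idea concrete, below is a minimal, purely illustrative sketch of a mixed-integer MPC-style policy: the integer input is handled by enumeration, the continuous input is optimized in closed form for a toy quadratic stage cost, and exploration is injected as a random perturbation of the cost rather than of the input itself, so the returned input always respects its bounds. All model constants, the cost function, and the function names are hypothetical and not taken from the paper.

```python
import random

# Bounds on the continuous input u (illustrative constraint set).
U_MIN, U_MAX = -1.0, 1.0

def stage_cost(x, u, i):
    # Hypothetical quadratic stage cost; the binary input i switches an
    # offset in the tracking term and adds a fixed activation cost.
    return (x + u - 0.5 * i) ** 2 + 0.1 * u ** 2 + 0.2 * i

def solve_continuous(x, i, d=0.0):
    # Minimize stage_cost(x, u, i) + d*u over u in [U_MIN, U_MAX].
    # In u the perturbed cost is 1.1*u^2 + (2*(x - 0.5*i) + d)*u + const,
    # so the unconstrained minimizer is closed-form; clipping enforces
    # the bound, keeping the input feasible for any perturbation d.
    u = -(2.0 * (x - 0.5 * i) + d) / 2.2
    u = max(U_MIN, min(U_MAX, u))
    return u, stage_cost(x, u, i) + d * u

def policy(x, explore_scale=0.0):
    # Exploration enters as a random gradient perturbation d*u on the cost:
    # the optimizer is re-solved with the perturbed objective, so the
    # exploratory input still satisfies the constraints by construction.
    d = random.gauss(0.0, explore_scale)
    best = None
    for i in (0, 1):  # enumerate the integer input
        u, cost = solve_continuous(x, i, d)
        if best is None or cost < best[2]:
            best = (u, i, cost)
    return best[0], best[1]
```

In a realistic multi-stage MPC the integer inputs would be handled by a mixed-integer solver rather than enumeration, but the structure is the same: exploration perturbs the optimization problem, not the solution, which is what keeps the constraints satisfied.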
Updated: 2020-04-06