Learning Control Barrier Functions from Expert Demonstrations
arXiv - CS - Systems and Control. Pub Date: 2020-04-07, DOI: arxiv-2004.03315
Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Inspired by the success of imitation and inverse reinforcement learning in replicating expert behavior through optimal control, we propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs). We consider the setting of a known nonlinear control-affine dynamical system and assume that we have access to safe trajectories generated by an expert. A practical example of such a setting is a kinematic model of a self-driving vehicle with safe trajectories (e.g., trajectories that avoid collisions with obstacles in the environment) generated by a human driver. We then propose and analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz smoothness assumptions on the underlying dynamical system. A strength of our approach is that it is agnostic to the parameterization used to represent the CBF, assuming only that the Lipschitz constant of such functions can be efficiently bounded. Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process. We end with extensive numerical evaluations of our results on both planar and realistic examples, using both random-feature and deep neural network parameterizations of the CBF. To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
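For readers unfamiliar with how a control barrier function is used once it is available, the following is a minimal sketch and not the authors' method or code. It assumes a hand-written barrier `h` (positive outside a disc obstacle) rather than a learned one, single-integrator dynamics x' = u (a control-affine system with f = 0 and g = identity), and a linear class-K term `alpha * h`. All function names, the obstacle, and the numerical values are hypothetical choices made for illustration. Under these assumptions, the usual one-constraint quadratic program reduces to projecting the nominal control onto a half-space.

```python
# Illustrative sketch only: a hand-written barrier h, not a CBF learned from data.

def h(x, center=(0.0, 0.0), radius=1.0):
    """Candidate barrier: positive outside the disc obstacle, negative inside."""
    dx, dy = x[0] - center[0], x[1] - center[1]
    return dx * dx + dy * dy - radius * radius


def grad_h(x, center=(0.0, 0.0)):
    """Gradient of h with respect to the state."""
    return (2.0 * (x[0] - center[0]), 2.0 * (x[1] - center[1]))


def cbf_filter(x, u_nom, alpha=1.0):
    """Closest control to u_nom that satisfies grad_h(x) . u >= -alpha * h(x).

    Assumes x' = u, so the constraint is a single half-space in u and the
    minimum-norm correction has a closed form.
    """
    a = grad_h(x)
    b = -alpha * h(x)
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    norm_sq = a[0] * a[0] + a[1] * a[1]
    if slack >= 0.0 or norm_sq == 0.0:
        # Nominal control already satisfies the constraint, or the gradient
        # vanishes and no correction along it is possible.
        return u_nom
    scale = -slack / norm_sq
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])


def simulate(x0=(-2.0, 0.1), goal=(2.0, 0.0), dt=0.01, steps=2000):
    """Euler-integrate the filtered system and track the smallest value of h."""
    x = x0
    min_h = h(x)
    for _ in range(steps):
        u_nom = (goal[0] - x[0], goal[1] - x[1])  # go-to-goal nominal control
        u = cbf_filter(x, u_nom)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
        min_h = min(min_h, h(x))
    return x, min_h


final_state, min_barrier = simulate()
```

In this toy setting the nominal controller would drive straight through the obstacle, while the filtered trajectory keeps `h` nonnegative. The paper's contribution concerns obtaining such a function from expert trajectories with guarantees; this sketch only shows how a given barrier constrains the control input.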

Updated: 2020-11-10
