Direct Policy Optimization Using Deterministic Sampling and Collocation
IEEE Robotics and Automation Letters (IF 5.2). Pub Date: 2021-03-25. DOI: 10.1109/lra.2021.3068890
Taylor Howell, Chunjiang Fu, Zachary Manchester

We present an approach for approximately solving discrete-time stochastic optimal-control problems by combining direct trajectory optimization, deterministic sampling, and policy optimization. Our feedback motion-planning algorithm uses a quasi-Newton method to simultaneously optimize a reference trajectory, a set of deterministically chosen sample trajectories, and a parameterized policy. We demonstrate that this approach exactly recovers LQR policies in the case of linear dynamics, quadratic objective, and Gaussian disturbances. We also demonstrate the algorithm on several nonlinear, underactuated robotic systems to highlight its performance and ability to handle control limits, safely avoid obstacles, and generate robust plans in the presence of unmodeled dynamics.
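To make two of the abstract's ingredients concrete, below is a minimal sketch, not the authors' implementation: it pairs deterministically chosen samples (sigma-point-style states around a nominal point) with the finite-horizon LQR policy that the method is claimed to recover exactly under linear dynamics, a quadratic objective, and Gaussian disturbances. The double-integrator system and all function names are hypothetical illustrations.

```python
# Hypothetical sketch: deterministic (sigma-point-style) sampling plus a
# finite-horizon LQR baseline. None of this code is from the paper.
import numpy as np

def lqr_gains(A, B, Q, R, horizon):
    """Finite-horizon discrete-time LQR gains via backward Riccati recursion."""
    P = Q
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # ordered from t = 0 to horizon - 1

def sigma_point_samples(x_nominal, covariance, scale=1.0):
    """Deterministic samples: nominal state +/- scaled matrix-square-root columns."""
    L = np.linalg.cholesky(covariance)
    n = x_nominal.size
    return [x_nominal] + [x_nominal + s * scale * L[:, i]
                          for i in range(n) for s in (+1.0, -1.0)]

# Double-integrator example: linear dynamics, quadratic cost.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.eye(2)
R = 0.1 * np.eye(1)

K0 = lqr_gains(A, B, Q, R, horizon=50)[0]
samples = sigma_point_samples(np.array([1.0, 0.0]), 0.01 * np.eye(2))

# Rolling each deterministic sample forward under u = -K x yields the
# sample trajectories that a policy-optimization step would score.
for x0 in samples:
    x = x0.copy()
    for _ in range(50):
        x = A @ x + B @ (-K0 @ x)
```

In the paper's setting, the policy parameters are optimized jointly with the reference and sample trajectories by a quasi-Newton method; the sketch above only shows the sampling and the LQR solution that serves as the linear-quadratic sanity check.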

Updated: 2021-05-07