Optimal Q-laws via reinforcement learning with guaranteed stability
Acta Astronautica (IF 3.5), Pub Date: 2021-07-13, DOI: 10.1016/j.actaastro.2021.07.010
Harry Holt, Roberto Armellin, Nicola Baresi, Yoshi Hashida, Andrea Turconi, Andrea Scorsoglio, Roberto Furfaro

Closed-loop feedback-driven control laws can be used to solve low-thrust many-revolution trajectory design and guidance problems with minimal computational cost. Lyapunov-based control laws offer the benefit of guaranteed stability, whilst their optimality can be improved by tuning their parameters. In this paper, a reinforcement learning framework is used to make the parameters of the Lyapunov-based Q-law state-dependent, increasing its optimality. The Jacobian of these state-dependent parameters is available analytically and, unlike in other optimisation approaches, can be used to enforce stability throughout the transfer. The results focus on GTO–GEO and LEO–GEO transfers in Keplerian dynamics, including the effects of eclipses. The impact of the network architecture on the behaviour is investigated for both time- and mass-optimal transfers. Robustness to navigation errors and thruster misalignment is demonstrated using Monte Carlo analyses. The resulting approach offers potential for on-board autonomous transfers and orbit reconfiguration.
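
As a concrete illustration of the mechanism the abstract describes, the sketch below is a minimal, hypothetical rendering, not the authors' implementation: a small network maps the orbital elements to strictly positive Q-law weights, which keeps a simplified proximity quotient Q = Σ_œ W_œ ((œ − œ_T)/œ̇_max)² positive definite about the target and hence a valid Lyapunov candidate. The names (WeightNet, q_value), the omission of the full Q-law's penalty and scaling terms, and all numerical values are assumptions for illustration.

```python
# Illustrative sketch (not the authors' code): a small MLP maps the
# current orbital elements to positive Q-law weights W_oe.  A softplus
# keeps every W_oe > 0, so the simplified proximity quotient
#   Q = sum_oe W_oe * ((oe - oe_T) / oe_dot_max)^2
# stays positive definite about the target regardless of the network
# outputs, preserving its role as a Lyapunov candidate.
import numpy as np

rng = np.random.default_rng(0)

class WeightNet:
    """Hypothetical MLP: orbital elements -> positive Q-law weights."""

    def __init__(self, n_in=5, n_hidden=16, n_out=5):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.b2 = np.zeros(n_out)

    def __call__(self, x):
        h = np.tanh(self.W1 @ x + self.b1)
        z = self.W2 @ h + self.b2
        return np.logaddexp(0.0, z) + 1e-3  # softplus + floor: W_oe > 0

def q_value(oe, oe_target, weights, oe_dot_max):
    """Simplified Q-law Lyapunov function (the full Q-law's penalty and
    scaling terms are omitted for brevity)."""
    d = (oe - oe_target) / oe_dot_max
    return float(weights @ d**2)

# GTO -> GEO style example with placeholder numbers:
# elements are (a [km], e, i [rad], RAAN [rad], argp [rad]).
oe      = np.array([24505.9, 0.725, 0.4974, 0.0, 0.0])  # GTO-like state
oe_tgt  = np.array([42165.0, 0.0,   0.0,    0.0, 0.0])  # GEO target
oe_rate = np.array([50.0, 1e-3, 1e-3, 1e-3, 1e-3])      # placeholder max rates

net = WeightNet()
W = net(oe / (np.abs(oe_tgt) + 1.0))  # crude input normalisation
print("Q =", q_value(oe, oe_tgt, W, oe_rate))
```

In the paper's framework the thrust direction is then chosen so that Q̇ ≤ 0 along the trajectory, and the analytically available Jacobian of the network outputs is what allows that stability condition to be enforced throughout the transfer; here the network is untrained and the element rates are placeholders.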



Updated: 2021-07-18