Autonomous Six-Degree-of-Freedom Spacecraft Docking with Rotating Targets via Reinforcement Learning
Journal of Aerospace Information Systems (IF 1.3), Pub Date: 2021-04-28, DOI: 10.2514/1.i010914
Charles E. Oestreich, Richard Linares, Ravi Gondhalekar

A policy for six-degree-of-freedom docking maneuvers with rotating targets is developed through reinforcement learning and implemented as a feedback control law. Potential clients for satellite servicing and orbital debris objects are often rotating around a constant axis within their respective Earth orbits. In the context of such missions, reinforcement learning provides an appealing framework for robust, autonomous maneuvers in uncertain environments with low on-board computational cost. This work uses proximal policy optimization to produce a docking policy for rotating or nonrotating targets that is valid over a portion of the six-degree-of-freedom state space while striving to minimize performance and control costs. Experiments using the simulated Apollo transposition and docking maneuver with an induced spin in the lunar module exhibit the policy’s capabilities and provide a comparison with standard optimal control techniques. Furthermore, specific challenges and workarounds, as well as the benefits and disadvantages of reinforcement learning for docking policies, are discussed to facilitate future research. As such, this work will serve as a foundation for further investigation of learning-based control laws for spacecraft proximity operations in uncertain environments.
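For readers unfamiliar with proximal policy optimization, the following is a minimal sketch of the standard clipped surrogate loss that the method is generally built around. It is not the authors' implementation: the function name `ppo_clipped_loss`, the clip range of 0.2, and the sample inputs are assumptions chosen for illustration, and the paper's actual reward design, network architecture, and hyperparameters are not given in the abstract.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples.

    new_log_probs / old_log_probs: log-probabilities of the taken actions
    under the updated and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    clip_eps: clip range; 0.2 is a common default, assumed here.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the loss maximizes the objective.
    return -total / len(advantages)


if __name__ == "__main__":
    # Illustrative numbers only.
    print(ppo_clipped_loss([math.log(2.0)], [0.0], [1.0]))
```

The clipping limits how much a single update can gain by moving the policy away from the one that collected the data, which is what makes the update "proximal."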




Updated: 2021-04-29