Deep replacement: Reinforcement learning based constellation management and autonomous replacement
Engineering Applications of Artificial Intelligence (IF 8), Pub Date: 2021-06-09, DOI: 10.1016/j.engappai.2021.104316
Joseph Kopacz , Jason Roney , Roman Herschitz

A Deep Reinforcement Learning (DRL) algorithm, Proximal Policy Optimization (PPO2), is deployed on a custom spacecraft (S/C) build and loss model to determine whether an Artificial Intelligence (AI) can learn to monitor satellite constellation health and determine an optimal replacement strategy. A custom environment is created to simulate how S/C are built, launched, generate revenue, and finally decay. The reinforcement learning agent successfully learned an optimal policy for two models: a Simplified Model, in which the financial cost of actions is ignored, and an Advanced Model, in which the financial cost of actions is a major element. In both models the AI monitors the constellations and takes multiple strategic and tactical actions to replace satellites and maintain constellation performance. In the Simplified Model, the PPO2 algorithm converged on an optimal solution after 200,000 simulations. The Advanced Model was much more difficult for the AI to learn: performance drops during the early episodes but eventually converges to an optimal policy after 25,000,000 simulations. With the Advanced Model, the AI takes actions that yield successful strategies for constellation management and satellite replacement while accounting for the financial implications of those actions. The methods in this paper thus provide initial research developments towards a real-world tool and AI application that can aid aerospace businesses in managing Low Earth Orbit (LEO) constellations. This type of AI application may become imperative for deploying and maintaining small-satellite mega-constellations.
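The build, launch, revenue, and decay loop described in the abstract can be illustrated with a toy environment. The sketch below is not the authors' model: the class name `ToyConstellationEnv`, the three-action set, the per-step failure probability, and every cost and revenue figure are hypothetical values chosen only for illustration. The `charge_costs` flag loosely mirrors the distinction between the Simplified Model (action costs ignored) and the Advanced Model (action costs included). A hand-written heuristic stands in for the policy; in the paper that role is played by the PPO2 agent.

```python
import random


class ToyConstellationEnv:
    """Toy build/launch/revenue/decay loop. All numbers are illustrative assumptions."""

    WAIT, BUILD, LAUNCH = 0, 1, 2

    def __init__(self, target=5, fail_prob=0.05, build_cost=2.0, launch_cost=3.0,
                 revenue_per_sat=1.0, horizon=100, charge_costs=True, seed=0):
        self.target = target                  # satellites needed for full service
        self.fail_prob = fail_prob            # per-step failure chance per satellite
        self.build_cost = build_cost
        self.launch_cost = launch_cost
        self.revenue_per_sat = revenue_per_sat
        self.horizon = horizon                # episode length in steps
        self.charge_costs = charge_costs      # False: costs ignored; True: costs count
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.in_orbit = self.target
        self.spares = 0
        self.t = 0
        return (self.in_orbit, self.spares)

    def step(self, action):
        cost = 0.0
        if action == self.BUILD:
            # Build one ground spare.
            self.spares += 1
            cost = self.build_cost
        elif action == self.LAUNCH and self.spares > 0:
            # Launch one spare; launching with no spares is a free no-op.
            self.spares -= 1
            self.in_orbit += 1
            cost = self.launch_cost
        # Decay: each satellite in orbit fails independently this step.
        failures = sum(self.rng.random() < self.fail_prob
                       for _ in range(self.in_orbit))
        self.in_orbit -= failures
        # Revenue is capped at the target constellation size.
        reward = self.revenue_per_sat * min(self.in_orbit, self.target)
        if self.charge_costs:
            reward -= cost
        self.t += 1
        done = self.t >= self.horizon
        return (self.in_orbit, self.spares), reward, done


def replace_on_loss(obs, target=5):
    """Heuristic stand-in for a learned policy: build, then launch, when short."""
    in_orbit, spares = obs
    if in_orbit < target:
        return ToyConstellationEnv.LAUNCH if spares > 0 else ToyConstellationEnv.BUILD
    return ToyConstellationEnv.WAIT


def run_episode(env, policy):
    """Run one episode and return the summed reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


if __name__ == "__main__":
    for charge in (False, True):
        env = ToyConstellationEnv(charge_costs=charge)
        print("charge_costs =", charge, "return =", run_episode(env, replace_on_loss))
```

Comparing the two runs gives a rough sense of why a cost-aware setting is harder: once building and launching reduce the reward, replacing every lost satellite immediately is no longer automatically the best choice.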




Updated: 2021-06-09