Autonomous Delay Tolerant Network Management Using Reinforcement Learning
Journal of Aerospace Information Systems (IF 1.5), Pub Date: 2021-03-29, DOI: 10.2514/1.i010920
Pau Garcia Buzzi 1 , Daniel Selva 1 , Marc Sanchez Net 2

Delay tolerant networks (DTNs) offer a set of standardized protocols to enable Internet-like connectivity across the solar system. Unlike other protocols such as the Transmission Control Protocol (TCP) and the Internet Protocol (IP), DTN protocols are robust to end-to-end connection disruptions and long delays. Although the behavior of DTN core protocols is well understood, management of DTNs is still an area of active research. This paper uses reinforcement learning (RL) to automate the management of a DTN node consisting of an orbital relay between the Moon and Earth. More specifically, the RL agent is in charge of deciding when to drop packets, when to change the data rate of the neighbor node links, when to reroute bundles to crosslinks, or when not to change any network parameter. The agent's goal is to maximize the bits received by the Deep Space Network while minimizing the capacity allocated to all controlled links and controlling the buffer utilization to avoid memory overflows. To assess the potential of using RL in DTN management, the performance of the trained RL agent is benchmarked against other non-RL-based policies in a realistic lunar scenario. Results show that the RL agent provides the highest reward, outperforming all non-RL policies in this scenario.
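The abstract names four kinds of decision and three competing objectives, but it does not give the reward formulation. The following is only a minimal sketch of how such a trade-off could be written down. The names `Action`, `StepOutcome`, and `toy_reward`, the weights, and the buffer limit are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The four kinds of decision listed in the abstract (names are illustrative)."""
    DROP_PACKETS = "drop_packets"
    CHANGE_DATA_RATE = "change_data_rate"
    REROUTE_TO_CROSSLINK = "reroute_to_crosslink"
    NO_CHANGE = "no_change"


@dataclass
class StepOutcome:
    """Hypothetical per-step measurements at the relay node."""
    bits_delivered: float      # bits received by the ground network this step
    capacity_allocated: float  # capacity allocated to the controlled links
    buffer_utilization: float  # fraction of the relay buffer in use, 0.0 to 1.0


def toy_reward(outcome: StepOutcome,
               w_bits: float = 1.0,
               w_capacity: float = 0.5,
               w_buffer: float = 10.0,
               buffer_limit: float = 0.9) -> float:
    """Assumed scalar reward: reward delivered bits, penalize allocated
    capacity, and penalize buffer utilization above an assumed limit."""
    overflow_risk = max(0.0, outcome.buffer_utilization - buffer_limit)
    return (w_bits * outcome.bits_delivered
            - w_capacity * outcome.capacity_allocated
            - w_buffer * overflow_risk)


if __name__ == "__main__":
    calm = StepOutcome(bits_delivered=10.0, capacity_allocated=4.0, buffer_utilization=0.5)
    full = StepOutcome(bits_delivered=10.0, capacity_allocated=4.0, buffer_utilization=1.0)
    print(toy_reward(calm))                     # 8.0
    print(toy_reward(full) < toy_reward(calm))  # True
```

A weighted sum is only one way to combine the objectives. It is used here because it makes the tension between throughput, link capacity, and buffer occupancy easy to see.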




Updated: 2021-03-30