BeiDou Short-Message Satellite Resource Allocation Algorithm Based on Deep Reinforcement Learning

Entropy (IF 2.7), Pub Date: 2021-07-22, DOI: 10.3390/e23080932
Kaiwen Xia ₁, Jing Feng ₁, Chao Yan ₁,₂, Chaofan Duan ₁

The fully deployed BDS-3 short-message communication system, known as the short-message satellite communication system (SMSCS), is expected to be widely used in areas that traditionally lack communication coverage. However, the short-message processing resources of short-message satellites are relatively scarce. To improve the resource utilization of the satellite system while keeping the service quality of short-message terminals adequate, the processing resources of short-message satellites must be allocated and scheduled within multi-satellite coverage areas. To address this problem, a short-message satellite resource allocation algorithm based on deep reinforcement learning (DRL-SRA) is proposed. First, based on the characteristics of the SMSCS, a multi-objective joint optimization model for satellite resource allocation is established; it aims to reduce the path transmission loss of short-message terminals, balance the load across satellites, and maintain an adequate quality of service. Then, the dimensionality of the input data is reduced using a region division strategy and a feature extraction network. The continuous spatial state is parameterized with a deep reinforcement learning algorithm based on the deep deterministic policy gradient (DDPG) framework. Simulation results show that the proposed algorithm reduces the path transmission loss of short-message terminals, improves the quality of service, and increases the resource utilization efficiency of the short-message satellite system while maintaining an appropriate satellite load balance.
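The abstract names three objectives (path transmission loss, satellite load balance, quality of service) but does not give the model itself. As a rough illustration only, the sketch below shows one way such objectives could be folded into a single scalar reward for a reinforcement learning agent. The weighted-sum form, the weights, the normalization constant `loss_ref_db`, and every function name are assumptions made for this example and are not taken from the paper; only the free-space path loss formula is standard.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def free_space_path_loss_db(distance_m, freq_hz):
    """Standard free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C)


def load_imbalance(loads, capacities):
    """Population variance of per-satellite utilization (0 = perfectly balanced)."""
    util = [load / cap for load, cap in zip(loads, capacities)]
    mean = sum(util) / len(util)
    return sum((u - mean) ** 2 for u in util) / len(util)


def reward(path_losses_db, loads, capacities, served, requested,
           w_loss=1.0, w_balance=1.0, w_qos=1.0, loss_ref_db=200.0):
    """Hypothetical weighted-sum reward (higher is better).

    Assumption: the three objectives are combined linearly; the paper's
    actual multi-objective model is not described in the abstract.
    """
    mean_loss = sum(path_losses_db) / len(path_losses_db)
    qos = served / requested if requested else 1.0  # fraction of requests served
    return (-w_loss * mean_loss / loss_ref_db
            - w_balance * load_imbalance(loads, capacities)
            + w_qos * qos)


if __name__ == "__main__":
    # Illustrative numbers only: two terminals, two satellites.
    losses = [free_space_path_loss_db(3.6e7, 1.6e9),
              free_space_path_loss_db(3.8e7, 1.6e9)]
    print(reward(losses, loads=[40, 60], capacities=[100, 100],
                 served=95, requested=100))
```

With all else equal, an allocation that spreads load evenly scores higher than one that concentrates it on a single satellite, which is the qualitative behavior a load-balancing term is meant to encourage.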

Updated: 2021-07-22
