Reinforcement Learning for Dynamic Resource Optimization in 5G Radio Access Network Slicing,arXiv - CS - Networking and Internet Architecture

arXiv - CS - Networking and Internet Architecture. Pub Date: 2020-09-14, DOI: arxiv-2009.06579
Yi Shi, Yalin E. Sagduyu, Tugba Erpek

The paper presents a reinforcement learning solution to dynamic resource allocation for 5G radio access network slicing. Available communication resources (frequency-time blocks and transmit powers) and computational resources (processor usage) are allocated to stochastic arrivals of network slice requests. Each request arrives with priority (weight), throughput, computational resource, and latency (deadline) requirements and, if feasible, is served with communication and computational resources allocated over its requested duration. Because each resource allocation decision makes some resources temporarily unavailable to future requests, a myopic solution that optimizes only the current allocation becomes ineffective for network slicing. Therefore, a Q-learning solution is presented to maximize the network utility, measured as the total weight of granted network slicing requests over a time horizon, subject to communication and computational constraints. Results show that reinforcement learning provides major improvements in 5G network utility relative to myopic, random, and first-come-first-served solutions. Reinforcement learning sustains scalable performance as the number of served users increases, and it can also be used effectively to assign resources to network slices when 5G must share the spectrum with incumbent users that may dynamically occupy some of the frequency-time blocks.
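To make the admission-control idea in the abstract concrete, the following is a minimal tabular Q-learning sketch in Python. It is not the authors' simulator or algorithm configuration: the environment `SlicingEnv`, the two request types, the capacity of four blocks, the coarse state (free blocks, pending request type), and all hyperparameters are assumptions chosen only for illustration. Transmit power, processor usage, deadlines, and incumbent users are left out.

```python
import random
from collections import defaultdict

# Hypothetical request types: (weight, blocks needed, duration in slots).
REQUEST_TYPES = [(1.0, 2, 3), (5.0, 2, 3)]
CAPACITY = 4  # total frequency-time blocks (assumed value)


class SlicingEnv:
    """Toy admission-control environment (illustrative, not the paper's model)."""

    def __init__(self, rng):
        self.rng = rng
        self.reset()

    def reset(self):
        self.active = []  # (remaining_slots, blocks) for each granted request
        self.request = self.rng.randrange(len(REQUEST_TYPES))
        return self.state()

    def free_blocks(self):
        return CAPACITY - sum(blocks for _, blocks in self.active)

    def state(self):
        # Coarse summary: remaining durations are hidden from the agent.
        return (self.free_blocks(), self.request)

    def step(self, action):
        """action 1 = grant the pending request if feasible, 0 = reject."""
        weight, blocks, duration = REQUEST_TYPES[self.request]
        reward = 0.0
        if action == 1 and blocks <= self.free_blocks():
            self.active.append((duration, blocks))
            reward = weight  # utility is the weight of the granted request
        # Advance one slot: age granted requests and release expired ones.
        self.active = [(t - 1, b) for t, b in self.active if t > 1]
        self.request = self.rng.randrange(len(REQUEST_TYPES))
        return self.state(), reward


def q_learning(episodes=2000, horizon=50, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Standard epsilon-greedy tabular Q-learning over the toy environment."""
    rng = random.Random(seed)
    env = SlicingEnv(rng)
    q = defaultdict(lambda: [0.0, 0.0])  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])  # exploit
            next_state, reward = env.step(action)
            td_target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (td_target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    table = q_learning()
    for state in sorted(table):
        print(state, [round(v, 2) for v in table[state]])
```

Running the script prints the learned action values for each visited state. Comparing the "reject" and "grant" values for the low-weight request when few blocks are free shows whether the agent has learned to hold capacity for later, higher-weight requests, which is the behavior a myopic policy cannot exhibit.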

Updated: 2020-09-15
