Dynamic video delivery using deep reinforcement learning for device-to-device underlaid cache-enabled Internet-of-vehicle networks
Journal of Communications and Networks (IF 2.9), Pub Date: 2021-04-16, DOI: 10.23919/jcn.2021.000006
Minseok Choi , Myungjae Shin , Joongheon Kim

This paper addresses an Internet-of-vehicle network that utilizes a device-to-device (D2D) underlaid cellular system, where distributed caching at each vehicle is available and video streaming service is provided via D2D links. Given the spectrum reuse policy, three decisions with different timescales in such a D2D underlaid cache-enabled vehicular network are investigated: 1) the selection of cache-enabled vehicles for providing contents, 2) power allocation for D2D users, and 3) power allocation for cellular vehicles. Since wireless link activation for video delivery could introduce delays, node association is determined on a larger timescale than power allocation. We jointly optimize these delivery decisions by maximizing the average video quality under constraints on the playback delays of streaming users and the data rate guarantees for cellular vehicles. Depending on the channel and queue states of users, the selection of the cache-enabled vehicle for video delivery is adaptively made based on frame-based Lyapunov optimization theory by comparing the expected costs of vehicles. For each cache-enabled vehicle, the expected cost is obtained from a stochastic shortest path problem that is solved by deep reinforcement learning without knowledge of global channel state information. Specifically, the deep deterministic policy gradient (DDPG) algorithm is adopted to deal with the very large state space, i.e., time-varying channel states. Simulation results verify that the proposed video delivery algorithm achieves all the given goals, i.e., average video quality, smooth playback, and reliable data rates for cellular vehicles.
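The frame-based node-association rule described above — comparing the expected costs of cache-enabled vehicles under current channel and queue states — can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the cost model (a drift-plus-penalty style trade-off between queue backlog and video quality, weighted by a Lyapunov parameter V) and all field names are assumptions; in the paper the expected cost would come from the learned DDPG critic rather than a closed-form expression.

```python
# Hypothetical sketch of frame-based vehicle selection under a
# drift-plus-penalty cost. All names and the cost model are illustrative
# assumptions; the paper estimates expected cost via deep RL (DDPG).

def expected_cost(queue_backlog, channel_gain, video_quality, v_param=1.0):
    """Drift-plus-penalty style cost for one candidate vehicle.

    A larger queue backlog or weaker channel raises the cost (delay
    pressure); a higher deliverable video quality lowers it, weighted
    by the Lyapunov trade-off parameter V.
    """
    drift = queue_backlog / channel_gain   # delay pressure on this D2D link
    penalty = -v_param * video_quality     # reward for higher video quality
    return drift + penalty

def associate(vehicles):
    """Associate with the cache-enabled vehicle of minimum expected cost."""
    return min(
        vehicles,
        key=lambda v: expected_cost(v["queue"], v["gain"], v["quality"]),
    )

vehicles = [
    {"id": "veh-A", "queue": 8.0, "gain": 0.5, "quality": 3.0},  # cost 13.0
    {"id": "veh-B", "queue": 2.0, "gain": 0.8, "quality": 2.5},  # cost 0.0
]
print(associate(vehicles)["id"])  # veh-B: lighter queue, stronger channel
```

In this toy instance the lightly loaded vehicle with the stronger channel wins the association even though it caches a slightly lower-quality version, mirroring the delay/quality trade-off the Lyapunov parameter controls.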

Updated: 2021-05-18