Double Coded Caching in Ultra Dense Networks: Caching and Multicast Scheduling via Deep Reinforcement Learning
IEEE Transactions on Communications (IF 7.2), Pub Date: 2020-02-01, DOI: 10.1109/tcomm.2019.2955490
Zhengming Zhang, Hongyang Chen, Meng Hua, Chunguo Li, Yongming Huang, Luxi Yang

The coded caching scheme proposed by Maddah-Ali and Niesen has been shown to alleviate network load efficiently. Recently, a new technique called the placement delivery array (PDA) was proposed to characterize coded caching schemes. In this paper, we consider a caching system in ultra dense networks (UDNs), where each base station (BS) has a finite cache and stores some contents. We propose an efficient coded content caching scheme, called double coded caching, that makes transmission robust to in-and-out wireless network quality. Dynamic caching and multicast scheduling are then considered to jointly minimize the average delay and power of content-centric wireless networks. This stochastic optimization problem can be formulated as a Markov decision process (MDP) with unknown transition probabilities and a large state space. We propose a deep reinforcement learning approach to solve this decision problem. Our algorithm uses a variational auto-encoder (VAE) neural network to approximate the state sufficiently, and a weighted double Q-learning scheme to reduce the variance and overestimation of the Q function. Numerical results demonstrate that the proposed double coded caching scheme increases the probability of successful transmission, and that the caching and scheduling policy effectively reduces delay and power consumption.
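
For readers unfamiliar with the coded caching idea the abstract builds on, the following is a minimal sketch of the classic two-user, two-file textbook example. It is not the double coded caching scheme of this paper, and every name in it (`xor_bytes`, `cache_user1`, the toy file contents) is an illustrative assumption. Each user caches half of every file, and a single XOR-coded multicast then serves both requests at once.

```python
# Illustrative sketch only: the textbook two-user, two-file coded caching
# example, NOT the paper's double coded caching scheme.
# All names and file contents below are hypothetical.

def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(x) != len(y):
        raise ValueError("inputs must have equal length")
    return bytes(a ^ b for a, b in zip(x, y))

# Two equally sized toy files, each split into two halves.
file_a = b"AAAAaaaa"
file_b = b"BBBBbbbb"
half = len(file_a) // 2
a1, a2 = file_a[:half], file_a[half:]
b1, b2 = file_b[:half], file_b[half:]

# Placement phase: each user caches one half of every file.
cache_user1 = {"A1": a1, "B1": b1}
cache_user2 = {"A2": a2, "B2": b2}

# Delivery phase: user 1 requests file A, user 2 requests file B.
# Instead of sending A2 and B1 separately, the server multicasts their XOR.
coded = xor_bytes(a2, b1)

# Each user cancels the half it already holds to recover the missing half.
recovered_a = cache_user1["A1"] + xor_bytes(coded, cache_user1["B1"])
recovered_b = xor_bytes(coded, cache_user2["A2"]) + cache_user2["B2"]
```

In this toy setting the coded multicast is half a file long, whereas sending the two missing halves separately would cost a full file, which is the kind of load reduction the abstract attributes to coded caching.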

Updated: 2020-02-01