Multi-Agent Deep Reinforcement Learning-Based Cooperative Edge Caching for Ultra-Dense Next-Generation Networks
IEEE Transactions on Communications (IF 7.2), Pub Date: 2020-12-14, DOI: 10.1109/tcomm.2020.3044298
Shuangwu Chen, Zhen Yao, Xiaofeng Jiang, Jian Yang, Lajos Hanzo

The soaring mobile data traffic demands have spawned the innovative concept of mobile edge caching in ultra-dense next-generation networks, which mitigates their heavy traffic burden. We conceive cooperative content sharing between base stations (BSs) for improving the exploitation of the limited storage of a single edge cache. We formulate the cooperative caching problem as a partially observable Markov decision process (POMDP) based multi-agent decision problem, which jointly optimizes the costs of fetching contents from the local BS, from the nearby BSs and from the remote servers. To solve this problem, we devise a multi-agent actor-critic framework, where a communication module is introduced to extract and share the variability of the actions and observations of all BSs. To beneficially exploit the spatio-temporal differences of the content popularity, we harness a variational recurrent neural network (VRNN) for estimating the time-variant popularity distribution in each BS. Based on multi-agent deep reinforcement learning, we conceive a cooperative edge caching algorithm where the BSs operate cooperatively, since the distributed decision making of each agent depends on both the local and the global states. Our experiments conducted within a large scale cellular network having numerous BSs reveal that the proposed algorithm relying on the collaboration of BSs substantially improves the benefits of edge caches.
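As an illustration only, and not the authors' implementation, the sketch below shows how a multi-agent actor-critic with a communication module of the kind described above can be wired up in PyTorch: each base-station agent encodes its local observation, the communication module aggregates a summary of the other agents' messages, and each agent's policy and value heads condition on the local observation concatenated with that shared context. All dimensions (NUM_AGENTS, OBS_DIM, MSG_DIM, NUM_CONTENTS), the layer sizes, and the mean-of-others message aggregation are assumptions made for this example.

    # Minimal sketch (illustrative assumptions throughout; not the paper's code).
    import torch
    import torch.nn as nn

    NUM_AGENTS = 4        # cooperating base stations (assumed)
    OBS_DIM = 32          # per-BS observation: local requests + cache state (assumed)
    MSG_DIM = 16          # size of the exchanged message vector (assumed)
    NUM_CONTENTS = 100    # candidate contents to cache (assumed)

    class CommModule(nn.Module):
        """Encodes a message from each agent's observation and gives every
        agent the mean message of the *other* agents as shared context."""
        def __init__(self):
            super().__init__()
            self.encode = nn.Linear(OBS_DIM, MSG_DIM)

        def forward(self, obs):                      # obs: (NUM_AGENTS, OBS_DIM)
            msgs = torch.tanh(self.encode(obs))      # (NUM_AGENTS, MSG_DIM)
            total = msgs.sum(dim=0, keepdim=True)    # (1, MSG_DIM)
            return (total - msgs) / (NUM_AGENTS - 1)

    class ActorCritic(nn.Module):
        """Per-agent policy and value heads, both conditioned on the local
        observation concatenated with the aggregated communication context."""
        def __init__(self):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(OBS_DIM + MSG_DIM, 64), nn.ReLU())
            self.actor = nn.Linear(64, NUM_CONTENTS)  # which content to cache
            self.critic = nn.Linear(64, 1)            # value estimate

        def forward(self, obs, context):
            h = self.body(torch.cat([obs, context], dim=-1))
            return self.actor(h), self.critic(h)

    comm = CommModule()
    agents = [ActorCritic() for _ in range(NUM_AGENTS)]

    obs = torch.randn(NUM_AGENTS, OBS_DIM)           # dummy local observations
    context = comm(obs)
    for i, agent in enumerate(agents):
        logits, value = agent(obs[i:i + 1], context[i:i + 1])
        action = torch.distributions.Categorical(logits=logits).sample()
        print(f"BS {i}: cache content {action.item()}, V = {value.item():.3f}")

In this sketch the communication module is what makes each agent's caching decision depend on both its local observation and a summary of the other BSs' states, mirroring the local-plus-global dependence that the abstract attributes to the proposed cooperative algorithm.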

Updated: 2020-12-14