Caching Transient Content for IoT Sensing: Multi-Agent Soft Actor-Critic, IEEE Transactions on Communications


Caching Transient Content for IoT Sensing: Multi-Agent Soft Actor-Critic
IEEE Transactions on Communications (IF 7.2), Pub Date: 2021-06-04, DOI: 10.1109/tcomm.2021.3086535
Xiongwei Wu, Xiuhua Li, Jun Li, P. C. Ching, Victor C. M. Leung, H. Vincent Poor

Edge nodes (ENs) in the Internet of Things commonly serve as gateways that cache sensing data while providing access services for data consumers. This paper considers multiple ENs that cache sensing data under the coordination of the cloud. In particular, each EN can fetch content generated by sensors within its coverage; this content can be uploaded to the cloud via fronthaul and then delivered to other ENs beyond the communication range. However, sensing data are usually transient over time, whereas frequent cache updates can lead to considerable energy consumption at sensors and heavy fronthaul traffic loads. Therefore, we adopt the Age of Information (AoI) to evaluate data freshness and investigate intelligent caching policies that preserve data freshness while reducing cache update costs. Specifically, we model the cache update problem as a cooperative multi-agent Markov decision process with the goal of minimizing the long-term average weighted cost. To efficiently handle the exponentially large number of actions, we devise a novel reinforcement learning approach, which is a discrete multi-agent variant of soft actor-critic (SAC). Furthermore, we generalize the proposed approach to decentralized control, where each EN makes decisions based on local observations only. Simulation results demonstrate the superior performance of the proposed SAC-based caching schemes.
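
The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formulation: the function names, the rule that a refreshed item restarts at age 1, and the cost form (mean Age of Information plus a weight times the update cost) are all assumptions made for illustration. The last function is the standard entropy-regularized state value used by discrete-action variants of soft actor-critic, where the reward would be the negative of the cost.

```python
import math

def step_age(ages, update):
    """Advance one time slot (assumed rule): a refreshed item restarts
    at age 1, every other cached item grows one slot older."""
    return [1 if u else a + 1 for a, u in zip(ages, update)]

def weighted_cost(ages, update, update_cost, weight):
    """Hypothetical per-slot cost: mean Age of Information plus
    `weight` times the total cost of the updates performed."""
    mean_age = sum(ages) / len(ages)
    paid = sum(c for c, u in zip(update_cost, update) if u)
    return mean_age + weight * paid

def soft_state_value(q_values, probs, alpha):
    """Discrete-action soft value: sum_a pi(a) * (Q(a) - alpha * log pi(a)).
    Actions with zero probability contribute nothing."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, probs) if p > 0)

# Example: three cached items, only the second one is refreshed.
ages = step_age([3, 5, 2], [False, True, False])
cost = weighted_cost(ages, [False, True, False], [2.0, 4.0, 1.0], weight=0.5)
```

In this toy setting, a larger `weight` makes updates more expensive relative to staleness, which is the trade-off the caching policy has to learn.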


Updated: 2021-06-04
