Distributed learning dynamics of Multi-Armed Bandits for edge intelligence
Journal of Systems Architecture ( IF 3.7 ) Pub Date : 2020-10-31 , DOI: 10.1016/j.sysarc.2020.101919
Shuzhen Chen , Youming Tao , Dongxiao Yu , Feng Li , Bei Gong

Multi-agent decision making is a fundamental problem in edge intelligence. In this paper, we study this problem for IoT networks under the distributed Multi-Armed Bandits (MAB) model. Most existing works on distributed MAB demand long-term stable networks connected by powerful devices and hence may not be suitable for mobile IoT networks with harsh resource constraints. To meet the challenge of resource constraints in mobile IoT environments, we propose a lightweight and robust learning algorithm for dynamic networks that allow topology changes. In our model, each agent is assumed to have only limited memory, and agents communicate with each other asynchronously. Moreover, we assume that the bandwidth for exchanging information is limited and each agent can transmit O(log₂ K) bits per communication, where K denotes the number of arms. Rigorous analysis shows that, despite these harsh constraints, the best arm/option can be identified collaboratively by the agents and the algorithm converges efficiently. Extensive experiments illustrate that the proposed algorithm exhibits good efficiency and stability in mobile settings.




Updated: 2020-11-02