Federated Multi-armed Bandits with Personalization
arXiv - CS - Information Theory. Pub Date: 2021-02-25, DOI: arXiv-2102.13101
Chengshuai Shi, Cong Shen, Jing Yang

A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization. Under the PF-MAB framework, a mixed bandit learning problem that flexibly balances generalization and personalization is studied. A lower bound analysis for the mixed model is presented. We then propose the Personalized Federated Upper Confidence Bound (PF-UCB) algorithm, where the exploration length is chosen carefully to achieve the desired balance between learning the local model and supplying global information for the mixed learning objective. Theoretical analysis proves that PF-UCB achieves an $O(\log(T))$ regret regardless of the degree of personalization, with an instance dependence similar to that of the lower bound. Experiments using both synthetic and real-world datasets corroborate the theoretical analysis and demonstrate the effectiveness of the proposed algorithm.
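The abstract does not spell out the mixed objective or the phase structure of PF-UCB, so the following is only a minimal illustrative sketch. It assumes that each client's mixed objective is an alpha-weighted combination of its local arm mean and the global average of all clients' means (the personalization parameter alpha, the problem sizes M, K, T, and the Bernoulli reward model are all assumptions for this example), and it runs a plain UCB rule on empirical mixed estimates rather than the paper's carefully tuned exploration phases.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (assumed, not taken from the abstract):
# M clients, K arms; mu[m, k] is client m's local mean for arm k.
# Assumed mixed objective with personalization weight alpha:
#   theta[m, k] = alpha * mu[m, k] + (1 - alpha) * average over clients of mu[:, k]
M, K, T, alpha = 5, 4, 10000, 0.7
mu = rng.uniform(0.1, 0.9, size=(M, K))
theta = alpha * mu + (1 - alpha) * mu.mean(axis=0, keepdims=True)

counts = np.zeros((M, K))   # number of pulls per client and arm
sums = np.zeros((M, K))     # cumulative rewards per client and arm
regret = np.zeros(M)        # regret measured against the mixed objective

for t in range(1, T + 1):
    # Local empirical means and a simple "server-side" average of them.
    local_est = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    global_est = local_est.mean(axis=0, keepdims=True)
    mixed_est = alpha * local_est + (1 - alpha) * global_est
    bonus = np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
    for m in range(M):
        untried = np.where(counts[m] == 0)[0]
        # Pull every arm once, then pick the arm maximizing the UCB
        # of the estimated mixed objective.
        k = untried[0] if untried.size else int(np.argmax(mixed_est[m] + bonus[m]))
        reward = rng.binomial(1, mu[m, k])   # assumed Bernoulli local reward
        counts[m, k] += 1
        sums[m, k] += reward
        regret[m] += theta[m].max() - theta[m, k]

print("average mixed-objective regret per client:", regret.mean())

Running this sketch shows the qualitative behavior the abstract describes: sharing the aggregated estimates helps each client when alpha is small (more weight on the global component), while a large alpha makes each client rely mostly on its own samples; it is not a reproduction of the paper's PF-UCB or its regret guarantee.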

Updated: 2021-02-26