Decentralized online convex optimization based on signs of relative states, Automatica



Decentralized online convex optimization based on signs of relative states
Automatica (IF 6.4), Pub Date: 2021-05-05, DOI: 10.1016/j.automatica.2021.109676
Xuanyu Cao, Tamer Başar

In this paper, we study a class of decentralized online convex optimization problems with time-varying loss functions over multi-agent networks. We propose a decentralized online subgradient method that uses only the signs of the relative states of neighbors, which considerably reduces the sensing and communication requirements for the agents. We show that, despite the loss of information, the proposed algorithm still achieves an $O(\sqrt{T})$ regret bound (where $T$ is the time horizon), which matches that of the standard distributed online subgradient method with exact relative state information. We further investigate the scenario of noisy signs, where the measurements of the directions of the relative states are perturbed by noise. We show that the regret bound is not affected as long as the noise is not too large, which demonstrates a certain noise tolerance of the proposed algorithm. Additionally, we extend the algorithm to the case of bandit feedback, where only the values of the local loss functions at two points are revealed to the agents at each time step. We demonstrate that bandit feedback does not change the order of the regret bound. Finally, numerical experiments on decentralized online least squares and logistic regression are conducted to corroborate the efficacy of the proposed algorithms.
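The abstract does not state the update rule, so the following is only a minimal sketch of what a sign-based decentralized online subgradient step might look like, not the authors' algorithm. It assumes that each agent moves along the summed elementwise signs of its neighbors' relative states and against its local subgradient, uses a step size proportional to $1/\sqrt{t}$, and projects onto a Euclidean ball. The names `sign_consensus_step` and `run_demo`, the ring graph, the ball radius, and the synthetic online least-squares data are all hypothetical choices made for illustration.

```python
import numpy as np


def sign_consensus_step(X, grads, adjacency, step, radius=10.0):
    """One round of an assumed sign-based update (illustrative only).

    X         : (n_agents, dim) current states
    grads     : (n_agents, dim) local subgradients at the current states
    adjacency : (n_agents, n_agents) boolean, symmetric, no self-loops
    """
    # signs[i, j] = elementwise sign of (x_j - x_i): the only neighbor
    # information agent i is assumed to sense.
    signs = np.sign(X[None, :, :] - X[:, None, :])
    consensus = (adjacency[:, :, None] * signs).sum(axis=1)
    X_new = X + step * consensus - step * grads
    # Projection onto a ball of the given radius (assumed feasible set).
    norms = np.linalg.norm(X_new, axis=1, keepdims=True)
    return X_new * np.minimum(1.0, radius / np.maximum(norms, 1e-12))


def run_demo(T=2000, n_agents=4, seed=0):
    """Toy online least squares on a ring graph (synthetic data)."""
    rng = np.random.default_rng(seed)
    x_star = np.array([1.0, -2.0])  # hypothetical ground-truth parameter
    dim = x_star.size

    # Ring graph: each agent senses its two neighbors.
    adjacency = np.zeros((n_agents, n_agents), dtype=bool)
    for i in range(n_agents):
        adjacency[i, (i + 1) % n_agents] = True
        adjacency[(i + 1) % n_agents, i] = True

    X = np.zeros((n_agents, dim))
    for t in range(1, T + 1):
        step = 0.2 / np.sqrt(t)  # diminishing step size (assumed schedule)
        # Local loss at time t: 0.5 * (a_i . x - b_i)^2 with fresh data.
        A_obs = rng.normal(size=(n_agents, dim))
        b = A_obs @ x_star + 0.01 * rng.normal(size=n_agents)
        residual = np.einsum("ij,ij->i", A_obs, X) - b
        grads = residual[:, None] * A_obs
        X = sign_consensus_step(X, grads, adjacency, step)
    return X, x_star


if __name__ == "__main__":
    X, x_star = run_demo()
    print("max deviation from x_star:", np.max(np.abs(X - x_star)))
```

On a symmetric graph the sign terms cancel when summed over all agents, so in this sketch they leave the network average unchanged and only pull the agents toward agreement; the subgradient terms drive that average toward the minimizer.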


Updated: 2021-05-05
