当前位置: X-MOL 学术arXiv.cs.GT › 论文详情
Policy Regret in Repeated Games
arXiv - CS - Computer Science and Game Theory Pub Date : 2018-11-09 , DOI: arxiv-1811.04127
Raman Arora; Michael Dinitz; Teodor V. Marinov; Mehryar Mohri

The notion of \emph{policy regret} in online learning is a well defined? performance measure for the common scenario of adaptive adversaries, which more traditional quantities such as external regret do not take into account. We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other. We then focus on the game-theoretic setting where the adversary is a self-interested agent. In that setting, we show that external regret and policy regret are not in conflict and, in fact, that a wide class of algorithms can ensure a favorable regret with respect to both definitions, so long as the adversary is also using such an algorithm. We also show that the sequence of play of no-policy regret algorithms converges to a \emph{policy equilibrium}, a new notion of equilibrium that we introduce. Relating this back to external regret, we show that coarse correlated equilibria, which no-external regret players converge to, are a strict subset of policy equilibria. Thus, in game-theoretic settings, every sequence of play with no external regret also admits no policy regret, but the converse does not hold.
更新日期:2020-03-24

 

全部期刊列表>>
聚焦肿瘤,探索癌症
欢迎探索2019年最具下载量的材料科学论文
论文语言润色服务
宅家赢大奖
如何将化学应用到可持续发展目标中
向世界展示您的会议墙报和演示文稿
全球疫情及响应:BMC Medicine专题征稿
新版X-MOL期刊搜索和高级搜索功能介绍
化学材料学全球高引用
ACS材料视界
x-mol收录
自然科研论文编辑服务
南方科技大学
南方科技大学
舒伟
中国科学院长春应化所于聪-4-8
复旦大学
课题组网站
X-MOL
香港大学化学系刘俊治
中山大学化学工程与技术学院
试剂库存
天合科研
down
wechat
bug