Online Learning via Offline Greedy Algorithms: Applications in Market Design and Optimization
arXiv - CS - Computer Science and Game Theory. Pub Date: 2021-02-18. DOI: arxiv-2102.11050. Rad Niazadeh (Chicago Booth School of Business, Operations Management), Negin Golrezaei (MIT Sloan School of Management, Operations Management), Joshua Wang (Google Research Mountain View), Fransisca Susan (MIT Sloan School of Management, Operations Management), Ashwinkumar Badanidiyuru (Google Research Mountain View)
Motivated by online decision-making in time-varying combinatorial
environments, we study the problem of transforming offline algorithms to their
online counterparts. We focus on offline combinatorial problems that are
amenable to a constant factor approximation using a greedy algorithm that is
robust to local errors. For such problems, we provide a general framework that
efficiently transforms offline robust greedy algorithms to online ones using
Blackwell approachability. We show that the resulting online algorithms have
$O(\sqrt{T})$ (approximate) regret under the full information setting. We
further introduce a bandit extension of Blackwell approachability that we call
Bandit Blackwell approachability. We leverage this notion to transform robust
greedy offline algorithms into online algorithms with $O(T^{2/3})$ (approximate)
regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our
offline-to-online transformation to several problems at the intersection of
revenue management, market design, and online optimization, including product
ranking optimization in online platforms, reserve price optimization in
auctions, and submodular maximization. We show that our transformation, when
applied to these applications, leads to new regret bounds or improves the
currently known bounds.
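Submodular maximization, one of the applications named above, is the canonical example of an offline problem that a greedy algorithm solves to within a constant factor. The sketch below illustrates that offline primitive only: the classic $(1-1/e)$-approximation greedy for monotone submodular maximization under a cardinality constraint, shown on a maximum-coverage objective. The ground set and objective are illustrative, not from the paper, and this does not implement the paper's offline-to-online transformation.

```python
# A minimal sketch of the offline greedy primitive assumed by the
# framework: repeatedly add the element with the largest marginal gain.
# For a monotone submodular f, this achieves a (1 - 1/e) approximation
# under a cardinality constraint (Nemhauser-Wolsey-Fisher).

def greedy_submodular(ground_set, f, k):
    """Greedily pick k elements maximizing a monotone submodular f."""
    chosen = []
    for _ in range(k):
        # Element with the largest marginal gain f(S + e) - f(S).
        best = max((e for e in ground_set if e not in chosen),
                   key=lambda e: f(chosen + [e]) - f(chosen))
        chosen.append(best)
    return chosen

# Example: maximum coverage, a canonical monotone submodular objective.
sets = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}

def coverage(selection):
    covered = set()
    for s in selection:
        covered |= sets[s]
    return len(covered)

picked = greedy_submodular(list(sets), coverage, k=2)
print(picked, coverage(picked))  # ['a', 'c'] 6
```

The paper's framework takes such a greedy routine, provided it is robust to local errors in each greedy step, and converts it into an online algorithm with $O(\sqrt{T})$ approximate regret in the full-information setting via Blackwell approachability.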
Updated: 2021-02-23