Kernel-based Methods for Bandit Convex Optimization
Journal of the ACM (IF 2.5), Pub Date: 2021-06-30, DOI: 10.1145/3453721
Sébastien Bubeck, Ronen Eldan, Yin Tat Lee

We consider the adversarial convex bandit problem and we build the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n)\sqrt{T}$-regret for this problem. To do so, we introduce three new ideas in the derivative-free optimization literature: (i) kernel methods, (ii) a generalization of Bernoulli convolutions, and (iii) a new annealing schedule for exponential weights (with increasing learning rate). The basic version of our algorithm achieves $\tilde{O}(n^{9.5}\sqrt{T})$-regret, and we show that a simple variant of this algorithm can be run in $\mathrm{poly}(n \log(T))$-time per step (for polytopes with polynomially many constraints) at the cost of an additional $\mathrm{poly}(n)\,T^{o(1)}$ factor in the regret. These results improve upon the $\tilde{O}(n^{11}\sqrt{T})$-regret and $\exp(\mathrm{poly}(T))$-time result of the first two authors, and the $\log(T)^{\mathrm{poly}(n)}\sqrt{T}$-regret and $\log(T)^{\mathrm{poly}(n)}$-time result of Hazan and Li. Furthermore, we conjecture that another variant of the algorithm could achieve $\tilde{O}(n^{1.5}\sqrt{T})$-regret, and moreover that this regret is unimprovable (the current best lower bound being $\Omega(n\sqrt{T})$, and it is achieved with linear functions). For the simpler situation of zeroth-order stochastic convex optimization, this corresponds to the conjecture that the optimal query complexity is of order $n^{3}/\varepsilon^{2}$.
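
The link between the conjectured regret bound and the conjectured query complexity can be sketched with the standard online-to-batch argument. The notation below ($K$ for the convex domain, $\ell_t$ for the loss at round $t$, $x_t$ for the point played, $f$ for the fixed objective in the stochastic setting, $\bar{x}_T$ for the average iterate) is assumed here for illustration and is not taken from the abstract; logarithmic factors and constants are ignored.

```latex
% Regret after T rounds (assumed standard definition):
R_T \;=\; \mathbb{E}\sum_{t=1}^{T} \ell_t(x_t) \;-\; \min_{x \in K} \sum_{t=1}^{T} \ell_t(x).

% Stochastic case: each query returns a noisy value of a fixed convex f.
% With \bar{x}_T = \frac{1}{T}\sum_{t=1}^{T} x_t, Jensen's inequality gives
\mathbb{E}\, f(\bar{x}_T) \;-\; \min_{x \in K} f(x) \;\le\; \frac{R_T}{T}.

% Plugging in the conjectured regret R_T \approx n^{1.5}\sqrt{T}:
\frac{R_T}{T} \;\approx\; \frac{n^{1.5}}{\sqrt{T}} \;\le\; \varepsilon
\quad\Longleftrightarrow\quad
T \;\ge\; \frac{n^{3}}{\varepsilon^{2}}.
```

Under these assumptions, $T$ of order $n^{3}/\varepsilon^{2}$ queries suffice for an $\varepsilon$-optimal point, which matches the order stated in the abstract. This conversion only shows sufficiency; the claim that this order is optimal remains the authors' conjecture.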

Updated: 2021-06-30