Asymptotically optimal algorithms for budgeted multiple play bandits
Machine Learning (IF 2.809), Pub Date: 2019-05-16, DOI: 10.1007/s10994-019-05799-x
Alex Luedtke, Emilie Kaufmann, Antoine Chambaz

Abstract: We study a generalization of the multi-armed bandit problem with multiple plays where there is a cost associated with pulling each arm and the agent has a budget at each time that dictates how much she can expect to spend. We derive an asymptotic regret lower bound for any uniformly efficient algorithm in our setting. We then study a variant of Thompson sampling for Bernoulli rewards and a variant of KL-UCB for both single-parameter exponential families and bounded, finitely supported rewards. We show these algorithms are asymptotically optimal, both in rate and leading problem-dependent constants, including in the thick margin setting where multiple arms fall on the decision boundary.
Updated: 2020-01-04

 
