MABWISER: Parallelizable Contextual Multi-armed Bandits
International Journal on Artificial Intelligence Tools (IF 1.1), Pub Date: 2021-06-30, DOI: 10.1142/s0218213021500214
Emily Strong, Bernard Kleynhans, Serdar Kadıoğlu

Contextual multi-armed bandit algorithms are an effective approach for online sequential decision-making problems. However, there are limited tools available to support their adoption in the community. To fill this gap, we present an open-source Python library with context-free, parametric and non-parametric contextual multi-armed bandit algorithms. The MABWiser library is designed to be user-friendly and supports custom bandit algorithms for specific applications. Our design provides built-in parallelization to speed up training and testing for scalability with special attention given to ensuring the reproducibility of results. The API makes hybrid strategies possible that combine non-parametric policies with parametric ones, an area that is not explored in the literature. As a practical application, we demonstrate using the library in both batch and online simulations for context-free, parametric and non-parametric contextual policies with the well-known MovieLens data set. Finally, we quantify the performance benefits of built-in parallelization.
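To make the batch and online usage pattern described in the abstract concrete, here is a minimal, dependency-free sketch of a context-free UCB1 bandit. It is not MABWiser's implementation or API: the class name `UCB1Sketch`, its method names, and the `alpha` exploration parameter are assumptions chosen for illustration, and the toy decisions and rewards are made up.

```python
import math


class UCB1Sketch:
    """Context-free UCB1 bandit (hypothetical illustration, not MABWiser code)."""

    def __init__(self, arms, alpha=1.0):
        self.arms = list(arms)
        self.alpha = alpha  # assumed exploration weight; larger means more exploration
        self.counts = {arm: 0 for arm in self.arms}
        self.sums = {arm: 0.0 for arm in self.arms}

    def fit(self, decisions, rewards):
        """Batch training: discard previous statistics and learn from history."""
        self.counts = {arm: 0 for arm in self.arms}
        self.sums = {arm: 0.0 for arm in self.arms}
        self.partial_fit(decisions, rewards)

    def partial_fit(self, decisions, rewards):
        """Online training: update the running statistics with new observations."""
        for arm, reward in zip(decisions, rewards):
            self.counts[arm] += 1
            self.sums[arm] += reward

    def expectations(self):
        """Return the upper confidence bound of every arm."""
        total = sum(self.counts.values())
        scores = {}
        for arm in self.arms:
            n = self.counts[arm]
            if n == 0:
                # An arm that was never played is tried first.
                scores[arm] = math.inf
            else:
                mean = self.sums[arm] / n
                bonus = self.alpha * math.sqrt(2.0 * math.log(total) / n)
                scores[arm] = mean + bonus
        return scores

    def predict(self):
        """Pick the arm with the highest upper confidence bound."""
        scores = self.expectations()
        return max(self.arms, key=lambda arm: scores[arm])


# Batch phase: learn from logged decisions and their observed rewards.
bandit = UCB1Sketch(arms=["A", "B"], alpha=1.0)
bandit.fit(decisions=["A", "A", "B", "B", "B"], rewards=[1, 0, 1, 1, 1])
first_choice = bandit.predict()

# Online phase: fold in one new observation and ask again.
bandit.partial_fit(decisions=[first_choice], rewards=[1])
second_choice = bandit.predict()
```

Separating `fit` from `partial_fit` mirrors the batch versus online distinction in the abstract. A contextual policy would additionally take a feature vector per decision, which this sketch leaves out.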

Updated: 2021-06-30