Is Q-Learning Provably Efficient? An Extended Analysis
arXiv - CS - Machine Learning. Pub Date: 2020-09-22. DOI: arXiv:2009.10396. Authors: Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar
This work extends the analysis of the theoretical results presented in the paper "Is Q-Learning Provably Efficient?" by Jin et al. We include a survey of related research to contextualize the need for stronger theoretical guarantees for one of the most important threads of model-free reinforcement learning. We also expound upon the reasoning used in the proofs to highlight the critical steps leading to the main result: Q-learning with UCB exploration achieves a sample efficiency that matches the optimal regret achievable by any model-based approach.
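The algorithm the abstract refers to can be summarized in a short sketch. The following is an illustrative, simplified rendition of tabular episodic Q-learning with a Hoeffding-style UCB bonus in the spirit of Jin et al.; the constant `c`, the toy environment interface (`step`, `reward`), and the fixed initial state are assumptions for illustration, not the paper's exact specification.

```python
import numpy as np

def q_learning_ucb(S, A, H, K, step, reward, c=1.0, p=0.01, seed=0):
    """Sketch of tabular Q-learning with UCB-Hoeffding exploration.

    S, A: number of states and actions; H: horizon; K: number of episodes.
    step(s, a, h, rng) -> next state; reward(s, a, h) -> reward in [0, 1].
    """
    rng = np.random.default_rng(seed)
    iota = np.log(S * A * H * K / p)      # log factor appearing in the bonus
    Q = np.full((H, S, A), float(H))      # optimistic initialization at H
    N = np.zeros((H, S, A), dtype=int)    # visit counts per (h, s, a)
    for _ in range(K):
        s = 0                             # fixed initial state (assumption)
        for h in range(H):
            a = int(np.argmax(Q[h, s]))   # act greedily w.r.t. optimistic Q
            s_next = step(s, a, h, rng)
            r = reward(s, a, h)
            N[h, s, a] += 1
            t = N[h, s, a]
            alpha = (H + 1) / (H + t)     # rescaled learning rate from the paper
            bonus = c * np.sqrt(H**3 * iota / t)  # Hoeffding-style UCB bonus
            V_next = min(H, Q[h + 1, s_next].max()) if h + 1 < H else 0.0
            target = r + V_next + bonus
            Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * target
            Q[h, s, a] = min(Q[h, s, a], H)  # keep Q bounded by the horizon
            s = s_next
    return Q
```

The choice of learning rate alpha_t = (H+1)/(H+t) is one of the critical steps the analysis highlights: it weights recent updates heavily enough that estimation error does not compound over the horizon, which is what ultimately yields the regret bound matching model-based methods.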
Updated: 2020-09-23