Quantum agents in the Gym: a variational quantum algorithm for deep Q-learning
Quantum ( IF 5.1 ) Pub Date : 2022-05-24 , DOI: 10.22331/q-2022-05-24-720
Andrea Skolik 1, 2 , Sofiene Jerbi 3 , Vedran Dunjko 1
Quantum machine learning (QML) has been identified as one of the key fields that could reap advantages from near-term quantum devices, next to optimization and quantum chemistry. Research in this area has focused primarily on variational quantum algorithms (VQAs), and several proposals to enhance supervised, unsupervised, and reinforcement learning (RL) algorithms with VQAs have been put forward. Of the three, RL is the least studied, and it remains an open question whether VQAs can be competitive with state-of-the-art classical algorithms based on neural networks (NNs), even on simple benchmark tasks. In this work, we introduce a training method for parametrized quantum circuits (PQCs) that can be used to solve RL tasks for discrete and continuous state spaces based on the deep Q-learning algorithm. We investigate which architectural choices for quantum Q-learning agents are most important for successfully solving certain types of environments by performing ablation studies for a number of different data encoding and readout strategies. We provide insight into why the performance of a VQA-based Q-learning algorithm crucially depends on the observables of the quantum model, and show how to choose suitable observables based on the learning task at hand. To compare our model against the classical DQN algorithm, we perform an extensive hyperparameter search of PQCs and NNs with varying numbers of parameters. We confirm that, similar to results in the classical literature, the architectural choices and hyperparameters contribute more to the agents' success in an RL setting than the number of parameters used in the model. Finally, we show when recent separation results between classical and quantum agents for policy gradient RL can be extended to inferring optimal Q-values in restricted families of environments.
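To make the readout strategy described in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of how a PQC-based Q-function can be simulated classically: a state is angle-encoded into qubit rotations, a variational layer is applied, and each action's Q-value is read out as the expectation value of a chosen observable. The circuit layout, parameter counts, and choice of Pauli-Z observables here are illustrative assumptions.

```python
import numpy as np

# Single-qubit building blocks for a tiny 2-qubit statevector simulation
I2 = np.eye(2, dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def ry(theta):
    """RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def kron_all(mats):
    """Kronecker product of a list of matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = np.kron(out, m)
    return out

def pqc_q_values(state, params):
    """Q-values of a toy 2-qubit PQC agent.

    state:  2 features, angle-encoded via RY rotations (one per qubit)
    params: 2 variational angles
    Returns one Q-value per action, each the expectation <Z_i>
    of a Pauli-Z observable on a different qubit (an assumed,
    illustrative readout strategy).
    """
    psi = np.zeros(4, dtype=complex)
    psi[0] = 1.0                                   # start in |00>
    psi = kron_all([ry(state[0]), ry(state[1])]) @ psi   # data encoding
    cz = np.diag([1, 1, 1, -1]).astype(complex)
    psi = cz @ psi                                 # entangling layer
    psi = kron_all([ry(params[0]), ry(params[1])]) @ psi # variational layer
    observables = [kron_all([Z, I2]), kron_all([I2, Z])] # one per action
    return np.array([np.real(psi.conj() @ O @ psi) for O in observables])

q = pqc_q_values(np.array([0.3, -0.7]), np.array([0.1, 0.5]))
# q holds two Q-value estimates, one per discrete action
```

Because the Q-values are expectation values, they are bounded in [-1, 1]; this is one reason (discussed in the paper) why the choice and scaling of observables must be matched to the range of optimal Q-values in the target environment.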

Updated: 2022-05-24