Combining Reinforcement Learning and Tensor Networks, with an Application to Dynamical Large Deviations
Physical Review Letters (IF 8.6), Pub Date: 2024-05-07, DOI: 10.1103/physrevlett.132.197301
Edward Gillman, Dominic C. Rose, Juan P. Garrahan

We present a framework to integrate tensor network (TN) methods with reinforcement learning (RL) for solving dynamical optimization tasks. We consider the RL actor-critic method, a model-free approach for solving RL problems, and introduce TNs as the approximators for its policy and value functions. Our “actor-critic with tensor networks” (ACTeN) method is especially well suited to problems with large and factorizable state and action spaces. As an illustration of the applicability of ACTeN we solve the exponentially hard task of sampling rare trajectories in two paradigmatic stochastic models, the East model of glasses and the asymmetric simple exclusion process, the latter being particularly challenging to other methods due to the absence of detailed balance. With substantial potential for further integration with the vast array of existing RL methods, the approach introduced here is promising both for applications in physics and to multi-agent RL problems more generally.
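To make the construction in the abstract concrete, here is a minimal sketch, in JAX, of the general idea of using a matrix product state (MPS) as the function approximator for both the policy and the value function inside a one-step actor-critic loop. This is not the authors' ACTeN implementation: the toy spin-chain environment, the activity-style reward standing in for a large-deviation tilt, and all names (`init_mps`, `mps_score`, etc.) are illustrative assumptions. The point it shows is the one the abstract emphasizes: the MPS score factorizes over sites, so the same parameterization scales to large, factorizable state and action spaces.

```python
# Minimal illustrative sketch (not the authors' code): an MPS parameterizes
# both the policy and the value function in a one-step actor-critic loop.
import jax
import jax.numpy as jnp

N, D, GAMMA, LR = 6, 4, 0.99, 1e-2   # sites, bond dimension, discount, step size

def init_mps(key):
    # One rank-3 tensor per site: (physical index 0/1, left bond, right bond).
    # Near-identity initialization keeps the chain contraction well-scaled.
    keys = jax.random.split(key, N)
    return [jnp.eye(D)[None] + 0.1 * jax.random.normal(k, (2, D, D)) for k in keys]

def mps_score(mps, config):
    # Contract the chain for one binary configuration -> scalar score.
    # The contraction factorizes over sites, which is what makes the
    # ansatz suited to large, factorizable state/action spaces.
    env = jnp.eye(D)
    for site in range(N):
        env = env @ mps[site][config[site]]
    return jnp.trace(env)

def logits(policy_mps, state):
    # One candidate action per site: flip that spin. Each action's logit
    # is the MPS score of the configuration it would produce.
    flips = jnp.stack([state.at[i].set(1 - state[i]) for i in range(N)])
    return jnp.stack([mps_score(policy_mps, f) for f in flips])

def log_prob(policy_mps, state, action):
    return jax.nn.log_softmax(logits(policy_mps, state))[action]

def reward(state, next_state):
    # Toy stand-in for a large-deviation tilt: penalize dynamical activity.
    return -jnp.abs(next_state - state).sum().astype(jnp.float32)

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
policy_mps, value_mps = init_mps(k1), init_mps(k2)
state = jnp.zeros(N, dtype=jnp.int32).at[0].set(1)

for step in range(200):
    key, k_act = jax.random.split(key)
    action = int(jax.random.categorical(k_act, logits(policy_mps, state)))
    next_state = state.at[action].set(1 - state[action])
    r = reward(state, next_state)

    # One-step TD error serves as the advantage estimate.
    v, v_next = mps_score(value_mps, state), mps_score(value_mps, next_state)
    delta = r + GAMMA * v_next - v

    # Critic: semi-gradient TD(0) step, w <- w + lr * delta * grad V(s).
    g_v = jax.grad(mps_score)(value_mps, state)
    value_mps = [A + LR * delta * g for A, g in zip(value_mps, g_v)]

    # Actor: policy-gradient step, theta <- theta + lr * delta * grad log pi(a|s).
    g_p = jax.grad(log_prob)(policy_mps, state, action)
    policy_mps = [A + LR * delta * g for A, g in zip(policy_mps, g_p)]

    state = next_state
```

Because the MPS tensors form a JAX pytree, `jax.grad` differentiates the contraction directly; swapping in a richer environment or a different tilt only changes `reward` and the transition rule.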

Updated: 2024-05-08