Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings
arXiv - CS - Machine Learning Pub Date : 2021-03-04 , DOI: arxiv-2103.02886
Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel

Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations. Experience replay improves sample-efficiency by reusing experiences from the past, and convolutional neural networks (CNNs) process high-dimensional inputs effectively. However, such techniques demand high memory and computational bandwidth. In this paper, we present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements. To reduce the computational overhead of gradient updates in CNNs, we freeze the lower layers of CNN encoders early in training due to early convergence of their parameters. Additionally, we reduce memory requirements by storing the low-dimensional latent vectors for experience replay instead of high-dimensional images, enabling an adaptive increase in the replay buffer capacity, a useful technique in constrained-memory settings. In our experiments, we show that SEER does not degrade the performance of RL agents while significantly saving computation and memory across a diverse set of DeepMind Control environments and Atari games. Finally, we show that SEER is useful for computation-efficient transfer learning in RL because lower layers of CNNs extract generalizable features, which can be used for different tasks and domains.
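The abstract describes two mechanisms: freezing the lower CNN layers once their parameters converge, and replaying low-dimensional latent vectors instead of raw images so the buffer can hold many more transitions. The sketch below illustrates both ideas in minimal NumPy; all class names are illustrative placeholders, and the fixed random projection stands in for a converged encoder, not the authors' actual architecture.

```python
import numpy as np

# Hedged sketch of SEER's two ideas (names are illustrative, not the
# authors' implementation):
#  1) after the encoder's lower layers converge, freeze them;
#  2) store the frozen encoder's low-dimensional latents in the replay
#     buffer instead of raw images, so the buffer holds more transitions.

class FrozenEncoder:
    """Stand-in for converged lower CNN layers: here a fixed random
    projection, purely for illustration."""
    def __init__(self, obs_dim, latent_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((obs_dim, latent_dim)) / np.sqrt(obs_dim)

    def encode(self, obs):
        # Once frozen, no gradients flow through this mapping.
        return (obs @ self.W).astype(np.float32)

class LatentReplayBuffer:
    """Replay buffer that stores latents rather than images, so a fixed
    memory budget fits far more transitions."""
    def __init__(self, capacity, latent_dim):
        self.latents = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.capacity = capacity
        self.size = 0

    def add(self, z):
        self.latents[self.size % self.capacity] = z
        self.size += 1

# Example: one 84x84 grayscale frame vs. a 50-dim latent vector.
obs_dim, latent_dim = 84 * 84, 50
enc = FrozenEncoder(obs_dim, latent_dim)
buf = LatentReplayBuffer(capacity=1000, latent_dim=latent_dim)
buf.add(enc.encode(np.ones(obs_dim, dtype=np.float32)))

# Per-transition memory ratio, images vs. latents (same dtype assumed):
print(round(obs_dim / latent_dim, 2))  # -> 141.12
```

Under these toy dimensions, each stored latent uses roughly 1/141 of the memory of the raw frame, which is the mechanism behind the paper's "adaptive increase in the replay buffer capacity."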

Updated: 2021-03-05