Good Actors can come in Smaller Sizes: A Case Study on the Value of Actor-Critic Asymmetry

arXiv - CS - Robotics. Pub Date: 2021-02-23, DOI: arxiv-2102.11893
Siddharth Mysore, Bassel Mabsout, Renato Mancuso, Kate Saenko

Actors and critics in actor-critic reinforcement learning algorithms are functionally separate, yet they often use the same network architectures. This case study explores the performance impact of network sizes when considering actor and critic architectures independently. By relaxing the assumption of architectural symmetry, it is often possible for smaller actors to achieve comparable policy performance to their symmetric counterparts. Our experiments show up to 97% reduction in the number of network weights with an average reduction of 64% over multiple algorithms on multiple tasks. Given the practical benefits of reducing actor complexity, we believe configurations of actors and critics are aspects of actor-critic design that deserve to be considered independently.
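
To make the scale of such reductions concrete, the following is a minimal sketch that counts the parameters of fully connected actor and critic networks when their sizes are chosen independently. The observation and action dimensions, the hidden-layer widths, and the helper `mlp_param_count` are all hypothetical values chosen for illustration, not configurations reported in the paper. The count includes bias terms, which may differ from how the authors tally network weights.

```python
def mlp_param_count(input_dim, hidden_sizes, output_dim):
    """Count weights and biases of a fully connected network."""
    sizes = [input_dim, *hidden_sizes, output_dim]
    # Each layer contributes an (n_in x n_out) weight matrix plus n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Hypothetical task dimensions (assumption, not taken from the paper).
OBS_DIM, ACT_DIM = 17, 6

# The critic here scores (observation, action) pairs with a single scalar output.
critic_params = mlp_param_count(OBS_DIM + ACT_DIM, [256, 256], 1)

# Symmetric setup: the actor uses the same hidden layers as the critic.
symmetric_actor_params = mlp_param_count(OBS_DIM, [256, 256], ACT_DIM)

# Asymmetric setup: the actor is shrunk while the critic is left unchanged.
small_actor_params = mlp_param_count(OBS_DIM, [32, 32], ACT_DIM)

reduction = 1 - small_actor_params / symmetric_actor_params

print(f"critic parameters:          {critic_params}")
print(f"symmetric actor parameters: {symmetric_actor_params}")
print(f"small actor parameters:     {small_actor_params}")
print(f"actor parameter reduction:  {reduction:.1%}")
```

Because only the actor is needed at deployment time, shrinking it reduces inference cost without touching the critic used during training. Whether a smaller actor matches the symmetric one in policy performance is an empirical question that this arithmetic does not answer.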

Updated: 2021-02-25
