GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning
arXiv - CS - Artificial Intelligence. Pub Date: 2021-02-22, DOI: arxiv-2102.11327
Guy Tennenholtz, Nir Baram, Shie Mannor

Offline reinforcement learning approaches can generally be divided into proximal and uncertainty-aware methods. In this work, we demonstrate the benefit of combining the two in a latent variational model. We impose a latent representation of states and actions and leverage its intrinsic Riemannian geometry to measure the distance of latent samples to the data. Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data. We integrate our metrics into a model-based offline optimization framework, in which proximity and uncertainty can be carefully controlled. We illustrate the geodesics on a simple grid-like environment, depicting its natural, inherent topology. Finally, we analyze our approach and improve upon contemporary offline RL benchmarks.
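
The idea of measuring distance through the latent space's Riemannian geometry can be illustrated with a small sketch. The abstract does not specify the metric, so the code below assumes one common construction: the pullback metric G(z) = J(z)^T J(z) induced by a decoder's Jacobian J. The toy `decoder`, the finite-difference `jacobian`, and the helper names `pullback_metric` and `curve_length` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def decoder(z):
    """Hypothetical toy decoder: maps a 2-D latent point onto a paraboloid in 3-D."""
    z = np.asarray(z, dtype=float)
    return np.array([z[0], z[1], z[0] ** 2 + z[1] ** 2])


def jacobian(f, z, eps=1e-6):
    """Central finite-difference Jacobian of f at z (shape: output_dim x latent_dim)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        cols.append((f(z + step) - f(z - step)) / (2 * eps))
    return np.stack(cols, axis=1)


def pullback_metric(f, z):
    """Assumed metric: G(z) = J(z)^T J(z), the pullback of the Euclidean metric."""
    J = jacobian(f, z)
    return J.T @ J


def curve_length(f, points):
    """Approximate Riemannian length of a piecewise-linear latent curve.

    Each segment contributes sqrt(d^T G(mid) d), with G evaluated at the
    segment midpoint and d the segment's displacement in latent space.
    """
    points = np.asarray(points, dtype=float)
    total = 0.0
    for a, b in zip(points[:-1], points[1:]):
        d = b - a
        G = pullback_metric(f, 0.5 * (a + b))
        total += float(np.sqrt(d @ G @ d))
    return total


# Straight latent line from (0, 0) to (1, 0), discretized into 100 segments.
line = np.stack([np.linspace(0.0, 1.0, 101), np.zeros(101)], axis=1)
length = curve_length(decoder, line)
```

Under this assumed metric, the straight latent segment is longer than its Euclidean latent length of 1 because the decoder stretches space away from the origin. A geodesic would be the latent curve minimizing this length between two endpoints.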

Updated: 2021-02-24