Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets
arXiv - CS - Robotics | Pub Date: 2020-05-19 | DOI: arxiv-2005.10622
Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, and Wulong Liu
Generative adversarial imitation learning (GAIL) has shown promising results
by taking advantage of generative adversarial nets, especially in the field of
robot learning. However, its requirement for isolated single-modal
demonstrations limits the scalability of the approach to real-world scenarios,
such as autonomous vehicles' need for a proper understanding of human
drivers' behavior. In this paper, we propose a novel multi-modal GAIL
framework, named Triple-GAIL, that introduces an auxiliary skill selector and
is able to learn skill selection and imitation jointly from both expert
demonstrations and continuously generated experiences, the latter serving as
data augmentation. We provide theoretical guarantees on the convergence to
optima for both the generator and the selector. Experiments on real driver
trajectories and real-time strategy game datasets demonstrate that Triple-GAIL
can better fit multi-modal behaviors close to the demonstrators and outperforms
state-of-the-art methods.
Updated: 2020-05-25