Learning Options from Demonstration using Skill Segmentation, arXiv - CS - Machine Learning



arXiv - CS - Machine Learning, Pub Date: 2020-01-19, DOI: arxiv-2001.06793
Matthew Cockcroft, Shahil Mawjee, Steven James, Pravesh Ranchod

We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering, and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstration is generated. Option initiation sets and termination conditions are learned from these trajectories using the one-class support vector machine clustering algorithm. We demonstrate our method in the four-rooms domain, where an agent is able to autonomously discover usable options from human demonstration. Our results show that these inferred options can then be used to improve learning and planning.
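For readers unfamiliar with the options framework, an option is commonly described as a triple: an initiation set (states where it may start), an internal policy, and a termination condition. The sketch below illustrates only that structure on a toy grid room; it is not the authors' implementation. The names `Option`, `run_option`, `DOORWAY`, and `to_doorway`, the 3x3 room layout, and the hand-written policy are all hypothetical, and the initiation set is a plain hand-specified set rather than one learned with a one-class support vector machine.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Set, Tuple

State = Tuple[int, int]   # (row, col) grid cell
Action = Tuple[int, int]  # (d_row, d_col) step


@dataclass(frozen=True)
class Option:
    """An option: initiation set, internal policy, termination condition."""
    initiation_set: FrozenSet[State]
    policy: Callable[[State], Action]
    terminates: Callable[[State], bool]

    def can_start(self, state: State) -> bool:
        # The option may only be invoked from states in its initiation set.
        return state in self.initiation_set


def run_option(option: Option, start: State,
               walls: Set[State] = frozenset(),
               max_steps: int = 50) -> List[State]:
    """Follow the option's policy from `start` until it terminates."""
    if not option.can_start(start):
        raise ValueError(f"option cannot be initiated in state {start}")
    state = start
    path = [state]
    for _ in range(max_steps):
        if option.terminates(state):
            break
        d_row, d_col = option.policy(state)
        nxt = (state[0] + d_row, state[1] + d_col)
        if nxt not in walls:  # bumping into a wall leaves the agent in place
            state = nxt
        path.append(state)
    return path


# Hypothetical toy room: cells (0..2, 0..2) with a doorway at (1, 3).
DOORWAY: State = (1, 3)
ROOM: FrozenSet[State] = frozenset((r, c) for r in range(3) for c in range(3))


def doorway_policy(state: State) -> Action:
    # Hand-written stand-in for a learned policy: align with the
    # doorway's row first, then walk right toward it.
    row, _ = state
    if row < DOORWAY[0]:
        return (1, 0)
    if row > DOORWAY[0]:
        return (-1, 0)
    return (0, 1)


to_doorway = Option(
    initiation_set=ROOM,
    policy=doorway_policy,
    terminates=lambda s: s == DOORWAY,
)

print(run_option(to_doorway, (0, 0)))
```

In the pipeline the abstract describes, the hand-specified `ROOM` set and termination test would instead be learned from the inferred trajectories; the data structure that consumes them could stay the same.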


Updated: 2020-01-22
