Robot learning through observation via coarse-to-fine grained video summarization
Applied Soft Computing (IF 8.7), Pub Date: 2020-11-19, DOI: 10.1016/j.asoc.2020.106913
Yujia Zhang, Qianzhong Li, Xiaoguang Zhao, Min Tan

Learning human daily behavior is important for enabling robots to perform tasks and assist people. However, most prior work either requires specific sensors for capturing data or relies heavily on prior knowledge of human motion, which can be difficult to obtain. To alleviate these problems, we propose a novel pipeline for robots to learn human behavior based on coarse-to-fine video summarization using a single Kinect camera. Specifically, the robot first retrieves information of general interest, then retrieves task-specific content, then focuses on fine-grained motion clips of human behavior, and finally guides itself with an object-centric learning method to complete the desired task. Our work has three unique advantages: (1) it enables the robot to effectively capture the granularity hierarchy of human behavior, exploiting multi-stage information efficiently while reducing disturbances and redundancies in the visual data; (2) it obtains knowledge by focusing on object movements in the summarized motion clips, which requires no prior knowledge of human motion; (3) it requires only a single Kinect sensor, which is readily accessible and easy to equip. Experiments in an office environment were performed to validate the efficiency and effectiveness of the proposed framework, and the results indicate that the approach exhibits good learning efficacy for the robot to understand human behavior and learn to perform tasks.




Updated: 2020-11-19