Hybrid machine learning for human action recognition and prediction in assembly
Robotics and Computer-Integrated Manufacturing (IF 9.1), Pub Date: 2021-05-26, DOI: 10.1016/j.rcim.2021.102184
Jianjing Zhang, Peng Wang, Robert X. Gao

As one of the critical elements for smart manufacturing, human-robot collaboration (HRC), which refers to goal-oriented joint activities of humans and collaborative robots in a shared workspace, has gained increasing attention in recent years. HRC is envisioned to break the traditional barrier that separates human workers from robots and greatly improve operational flexibility and productivity. To realize HRC, a robot needs to recognize and predict human actions in order to provide assistance in a safe and collaborative manner. This paper presents a hybrid approach to context-aware human action recognition and prediction, based on the integration of a convolutional neural network (CNN) and variable-length Markov modeling (VMM). Specifically, a bi-stream CNN structure parses human and object information embedded in video images as the spatial context for action recognition and collaboration context identification. The dependencies embedded in the action sequences are subsequently analyzed by a VMM, which adaptively determines the optimal number of current and past actions that need to be considered in order to maximize the probability of accurate future action prediction. The effectiveness of the developed method is evaluated experimentally on a testbed which simulates an assembly environment. High accuracy in both action recognition and prediction is demonstrated.
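
To make the prediction step more concrete, the following is a minimal sketch of variable-order Markov prediction over action sequences. It is not the authors' implementation: the class name, the action labels, the `max_order` and `min_count` parameters, and the rule for choosing the context length (the most confident context with enough support, preferring longer contexts on ties) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


class VariableOrderMarkov:
    """Toy variable-order Markov predictor (illustrative assumption only)."""

    def __init__(self, max_order=3, min_count=2):
        self.max_order = max_order  # longest history considered (assumed)
        self.min_count = min_count  # minimum support for a context (assumed)
        # counts[context][next_action] = number of times seen in training
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        for seq in sequences:
            for i, action in enumerate(seq):
                # Record the action under every context of length 0..max_order
                for k in range(self.max_order + 1):
                    if i - k < 0:
                        break
                    self.counts[tuple(seq[i - k:i])][action] += 1
        return self

    def predict(self, history):
        """Return (action, probability, context_length_used)."""
        best = (None, 0.0, 0)
        # Scan from the longest usable context down to the empty context
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = tuple(history[len(history) - k:])
            seen = self.counts.get(context)
            if not seen:
                continue
            total = sum(seen.values())
            if total < self.min_count:
                continue
            action, n = seen.most_common(1)[0]
            # Strict '>' keeps the longer context when probabilities tie
            if n / total > best[1]:
                best = (action, n / total, k)
        return best


# Hypothetical assembly action sequences, not taken from the paper
sequences = [
    ["grasp", "hold", "screw", "place"],
    ["grasp", "hold", "screw", "place"],
    ["grasp", "insert", "screw", "inspect"],
    ["grasp", "insert", "screw", "inspect"],
]
model = VariableOrderMarkov(max_order=3, min_count=2).fit(sequences)

print(model.predict(["grasp", "hold", "screw"]))
print(model.predict(["insert", "screw"]))
print(model.predict(["screw"]))
```

In this toy data the single most recent action ("screw") is ambiguous, while one additional past action resolves the prediction, which is the kind of dependency a variable-length context is meant to capture.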




Updated: 2021-05-27