Unreal mask: one-shot multi-object class-based pose estimation for robotic manipulation using keypoints with a synthetic dataset
Neural Computing and Applications (IF 6), Pub Date: 2021-04-08, DOI: 10.1007/s00521-020-05644-6
S. H. Zabihifar , A. N. Semochkin , E. V. Seliverstova , A. R. Efimov

Object pose estimation is a prerequisite for many robotic applications. Preparing a dataset for network training is a challenging part of pose estimation approaches, and in most of them the network can detect only the objects it was trained on. Synthetic data are a promising way to train deep neural networks for robotic manipulation, because large amounts of prelabeled training data can be generated safely. We investigate the reality gap in the pose estimation of intra-category objects from a single RGB-D image using keypoints. The proposed approach provides a fast and simple procedure for training a deep neural network to identify an object and its keypoints, based on a synthetic dataset and an autolabeling program. To our knowledge, this is the first deep network trained only on synthetic data that can find keypoints of intra-category objects for pose estimation purposes. The speed of training and the simplicity of the method make it very easy to add a new class of objects to the system, which is the main advantage of this approach. Using this approach, we demonstrate a near-real-time system that estimates object poses with sufficient accuracy for real-world semantic grasping and manipulation of intra-category objects in clutter by a real robot.




Updated: 2021-04-09