Real-time deep learning approach to visual servo control and grasp detection for autonomous robotic manipulation
Robotics and Autonomous Systems (IF 4.3) Pub Date: 2021-02-24, DOI: 10.1016/j.robot.2021.103757
Eduardo Godinho Ribeiro, Raul de Queiroz Mendes, Valdir Grassi

Robots still cannot perform everyday manipulation tasks, such as grasping, with the same dexterity as humans. To explore the potential of supervised deep learning for robotic grasping in unstructured and dynamic environments, this work addresses the visual perception phase of the task. This phase involves processing visual data to obtain the location of the object to be grasped, its pose, and the points at which the robot's gripper must make contact to ensure a stable grasp. For this, the Cornell Grasping Dataset (CGD) is used to train a Convolutional Neural Network (CNN) that considers these three stages simultaneously. In other words, given an image of the robot's workspace containing a certain object, the network predicts a grasp rectangle that encodes the position, orientation, and opening of the robot's parallel gripper the instant before closing. In addition to this network, which runs in real time, a second network is designed to handle situations in which the object moves in the environment. This second convolutional network is trained to perform visual servo control, ensuring that the object remains in the robot's field of view. It predicts the proportional values of the linear and angular velocities that the camera must have so that the object stays in the image processed by the grasp network. The dataset used for training was automatically generated by a Kinova Gen3 robotic manipulator with seven Degrees of Freedom (DoF). The same robot is used to evaluate real-time applicability and to obtain practical results from the designed algorithms. Moreover, the offline results obtained on test sets are analyzed and discussed with respect to efficiency and processing speed. The developed controller achieves millimeter accuracy in the final position for a target object seen for the first time.
To the best of our knowledge, no other work in the literature achieves such precision with a controller learned from scratch. This work therefore presents a new system for autonomous robotic manipulation that generalizes to different objects and runs at high processing speed, allowing its application in real robotic systems.
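To make the two predicted quantities concrete, the sketch below shows, under illustrative assumptions, how a grasp-rectangle output (center, angle, jaw opening) might be packaged into a grasp command, and how normalized proportional velocity outputs from the servo network might be scaled to a robot's velocity limits. All names, units, and limits here are hypothetical and are not taken from the paper's implementation.

```python
import math

def rect_to_grasp(x, y, theta_deg, opening, jaw_size):
    """Convert a predicted grasp rectangle into a simple grasp command.

    The rectangle's center (x, y) is in pixel coordinates, theta_deg is the
    gripper rotation about the camera axis, 'opening' is the distance between
    the parallel jaws before closing, and 'jaw_size' is the jaw width.
    This mapping is a simplified illustration, not the paper's API.
    """
    return {
        "center": (x, y),
        "angle_rad": math.radians(theta_deg),
        "opening": opening,
        "jaw_size": jaw_size,
    }

def servo_command(v_pred, w_pred, v_max, w_max):
    """Scale the servo network's normalized (proportional) outputs to the
    robot's linear and angular velocity limits, clamping to [-1, 1] first --
    a minimal proportional control law for keeping the object in view."""
    v = [max(-1.0, min(1.0, c)) * v_max for c in v_pred]   # linear (m/s)
    w = [max(-1.0, min(1.0, c)) * w_max for c in w_pred]   # angular (rad/s)
    return v, w
```

For example, a predicted rectangle centered at (120, 80) with a 90-degree angle yields a gripper rotation of pi/2 rad, and a normalized linear output of 0.5 with a 0.2 m/s limit yields a 0.1 m/s command.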




Updated: 2021-02-24