Abstract
Real-time vision-based robotic grasping is challenging in clutter. In such scenes, the target object must be perceived accurately even though it may be occluded or misrecognized among many distractors, including irrelevant objects and the robotic arm itself. In addition, the camera's limited field of view (FOV) makes it easy for objects to leave the camera view. We develop a novel camera fusion method for pose estimation based on a switching scheme, targeting real-time robotic grasping under a hybrid eye-in-hand (EIH)/eye-to-hand (ETH) configuration. Objects are locked on through occlusion-aware object detection, and a switching function selects between single-camera pose estimation and multi-view fusion. This method improves the accuracy of pose estimation and the robustness of dynamic grasping under occlusion. Experimental results on pose estimation and real-time robotic grasping in clutter verify the effectiveness of the proposed method.
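The switching idea described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's actual algorithm: the occlusion flags, confidence weights, and the weighted-average fusion rule are hypothetical placeholders standing in for the occlusion-aware detector and the fusion method.

```python
import numpy as np

def fuse_poses(p_eih, p_eth, w_eih, w_eth):
    """Hypothetical fusion rule: confidence-weighted average of the two
    translation estimates (the paper's actual fusion method may differ)."""
    return (w_eih * p_eih + w_eth * p_eth) / (w_eih + w_eth)

def estimate_pose(p_eih, p_eth, occluded_eih, occluded_eth,
                  conf_eih=0.5, conf_eth=0.5):
    """Switching scheme: if the target is occluded in one view, fall back to
    the other camera alone; if both views see it, fuse the two estimates.
    Returns None when the target is visible in neither camera."""
    if occluded_eih and occluded_eth:
        return None            # target lost in both views
    if occluded_eih:
        return p_eth           # ETH camera only
    if occluded_eth:
        return p_eih           # EIH camera only
    return fuse_poses(p_eih, p_eth, conf_eih, conf_eth)
```

In this sketch, each `p_*` is a 3-vector position estimate from one camera; a full implementation would also fuse orientation and derive the occlusion flags and weights from the detector's output.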
Acknowledgments
This work was supported by the National Natural Science Foundation of China (51675329, 51675342, 51775332), the Innovative Methods Program of the Ministry of Science and Technology of China (2015IM010100), the National Key Scientific Instruments and Equipment Development Program of China (2016YFF0101602), the Shanghai Committee of Science and Technology (15142200800, 16441906000, 16XD1425000), and the State Key Laboratory of Smart Manufacturing for Special Vehicles and Transmission System (GZ2016KF001).
Cite this article
Liu, W., Hu, J. & Wang, W. A Novel Camera Fusion Method Based on Switching Scheme and Occlusion-Aware Object Detection for Real-Time Robotic Grasping. J Intell Robot Syst 100, 791–808 (2020). https://doi.org/10.1007/s10846-020-01236-7