A Novel Camera Fusion Method Based on Switching Scheme and Occlusion-Aware Object Detection for Real-Time Robotic Grasping

Published in: Journal of Intelligent & Robotic Systems

Abstract

Real-time vision-based robotic grasping is challenging in clutter. In such scenes, the target object must be perceived accurately even though it may be occluded or misrecognized among distractors, including irrelevant objects and the robotic arm itself. In addition, a camera's limited field of view (FOV) makes it easy for objects to leave the view. We develop a novel camera fusion method for pose estimation based on a switching scheme, enabling real-time robotic grasping under a hybrid eye-in-hand (EIH)/eye-to-hand (ETH) configuration. Targets are locked on using occlusion-aware object detection, and a switching function selects between single-camera pose estimation and multi-view fusion. The method improves the accuracy of pose estimation and the robustness of dynamic grasping under occlusion. Experimental results on pose estimation and real-time robotic grasping in clutter verify the effectiveness of the proposed method.
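
The switching scheme described above reduces to a small decision rule: fuse the eye-in-hand and eye-to-hand estimates when both cameras see the target unoccluded, and fall back to whichever single view remains valid otherwise. The following is a minimal sketch of that rule; the function names (`estimate_target_pose`, `fuse_poses`) are hypothetical, and the inverse-covariance fusion step is a generic assumption rather than the paper's exact formulation.

```python
import numpy as np

def fuse_poses(pose_a, cov_a, pose_b, cov_b):
    # Inverse-covariance (information-form) fusion of two 6-DoF pose
    # estimates: each pose is a 6-vector (translation + axis-angle
    # rotation) with a 6x6 covariance. This fusion rule is a generic
    # assumption, not taken from the paper.
    info_a = np.linalg.inv(cov_a)
    info_b = np.linalg.inv(cov_b)
    cov = np.linalg.inv(info_a + info_b)
    pose = cov @ (info_a @ pose_a + info_b @ pose_b)
    return pose, cov

def estimate_target_pose(eih_obs, eth_obs):
    # Switching function. Each argument is a (pose, covariance) pair
    # produced by an occlusion-aware detector, or None when the target
    # is occluded or outside that camera's FOV.
    if eih_obs is not None and eth_obs is not None:
        return fuse_poses(*eih_obs, *eth_obs)  # both views valid: fuse
    if eih_obs is not None:
        return eih_obs  # eye-in-hand view only
    if eth_obs is not None:
        return eth_obs  # eye-to-hand view only
    return None  # target lost in both views; the grasp loop must re-detect
```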

Acknowledgments

This work was supported by the National Natural Science Foundation of China (51675329, 51675342, 51775332), the Innovative Methods Program of the Ministry of Science and Technology of China (2015IM010100), the National Key Scientific Instruments and Equipment Development Program of China (2016YFF0101602), the Shanghai Committee of Science and Technology (15142200800, 16441906000, 16XD1425000), and the State Key Laboratory of Smart Manufacturing for Special Vehicles and Transmission System (GZ2016KF001).

Author information

Correspondence to Jie Hu or Weiming Wang.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, W., Hu, J. & Wang, W. A Novel Camera Fusion Method Based on Switching Scheme and Occlusion-Aware Object Detection for Real-Time Robotic Grasping. J Intell Robot Syst 100, 791–808 (2020). https://doi.org/10.1007/s10846-020-01236-7
