Abstract
Learning to grasp novel objects is a challenging problem for service robots, especially when the robot performs goal-oriented manipulation or interaction tasks with only single-view RGB-D sensor data available. While some visual approaches focus on grasps that satisfy only force-closure criteria, we further link affordance-based task constraints to grasp poses on object parts, so that both force closure and task constraints are ensured. In this paper, a new single-view approach is proposed for task-constrained grasp pose detection. We learn a pixel-level affordance detector based on a convolutional neural network. The affordance detector provides a fine-grained understanding of the task constraints on objects, which is formulated as a pre-segmentation stage in the grasp pose detection framework. The accuracy and robustness of grasp pose detection are improved by a novel method for calculating the local reference frame, as well as a position-sensitive fully convolutional neural network for grasp stability classification. Experiments on benchmark datasets show that our method outperforms state-of-the-art methods. We have also validated our method in real-world, task-specific grasping scenes, achieving a higher success rate for task-oriented grasping.
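The pre-segmentation stage described above can be sketched in a few lines: given a per-pixel affordance probability map (in the paper this comes from the learned CNN detector; here it is taken as given), pixels carrying the task-relevant affordance are back-projected through the depth image into a task-constrained point cloud, on which grasp candidates are then generated. This is an illustrative sketch under a standard pinhole camera model, not the authors' implementation; the function and parameter names are hypothetical.

```python
import numpy as np

def affordance_presegment(depth, affordance_probs, intrinsics, label, thresh=0.5):
    """Back-project depth pixels whose probability for the affordance
    `label` exceeds `thresh` into a 3D point cloud (camera frame).

    depth            : (H, W) depth image in meters, 0 = invalid
    affordance_probs : dict mapping affordance label -> (H, W) probability map
    intrinsics       : (fx, fy, cx, cy) pinhole camera parameters
    Returns an (N, 3) array of task-constrained points.
    """
    fx, fy, cx, cy = intrinsics
    # Keep only valid-depth pixels carrying the task-relevant affordance.
    mask = (affordance_probs[label] > thresh) & (depth > 0)
    v, u = np.nonzero(mask)            # row (v) and column (u) pixel indices
    z = depth[v, u]
    x = (u - cx) * z / fx              # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

Downstream, grasp candidates would be sampled only on these points, so every candidate already satisfies the task constraint before force-closure scoring.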
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61573101 and 61573100) and the Fundamental Research Funds for the Central Universities (No. 2242019k30017).
Cite this article
Qian, K., Jing, X., Duan, Y. et al. Grasp Pose Detection with Affordance-based Task Constraint Learning in Single-view Point Clouds. J Intell Robot Syst 100, 145–163 (2020). https://doi.org/10.1007/s10846-020-01202-3