Abstract
Learning to grasp novel objects is a challenging problem for service robots, especially when the robot performs goal-oriented manipulation or interaction tasks with only single-view RGB-D sensor data available. While some visual approaches focus on grasps that satisfy only force-closure criteria, we further link affordance-based task constraints to grasp poses on object parts, so that both force closure and task constraints are ensured. In this paper, a new single-view approach is proposed for task-constrained grasp pose detection. We learn a pixel-level affordance detector based on a convolutional neural network. The affordance detector provides a fine-grained understanding of the task constraints on objects, which is formulated as a pre-segmentation stage in the grasp pose detection framework. The accuracy and robustness of grasp pose detection are improved by a novel method for calculating the local reference frame, as well as a position-sensitive fully convolutional neural network for grasp stability classification. Experiments on benchmark datasets show that our method outperforms state-of-the-art methods. We have also validated our method in real-world, task-specific grasping scenes, achieving a higher success rate for task-oriented grasping.
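The pre-segmentation stage described above can be sketched in a few lines: given a per-pixel affordance probability map (in the paper this comes from the learned CNN detector; here it is taken as given), pixels carrying the task-relevant affordance are back-projected through the depth image into a task-constrained point cloud, on which grasp candidates are then generated. This is an illustrative sketch under a standard pinhole camera model, not the authors' implementation; the function and parameter names are hypothetical.

```python
import numpy as np

def affordance_presegment(depth, affordance_probs, intrinsics, label, thresh=0.5):
    """Back-project depth pixels whose probability for the affordance
    `label` exceeds `thresh` into a 3D point cloud (camera frame).

    depth            : (H, W) depth image in meters, 0 = invalid
    affordance_probs : dict mapping affordance label -> (H, W) probability map
    intrinsics       : (fx, fy, cx, cy) pinhole camera parameters
    Returns an (N, 3) array of task-constrained points.
    """
    fx, fy, cx, cy = intrinsics
    # Keep only valid-depth pixels carrying the task-relevant affordance.
    mask = (affordance_probs[label] > thresh) & (depth > 0)
    v, u = np.nonzero(mask)            # row (v) and column (u) pixel indices
    z = depth[v, u]
    x = (u - cx) * z / fx              # pinhole back-projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)
```

Downstream, grasp candidates would be sampled only on these points, so every candidate already satisfies the task constraint before force-closure scoring.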
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 61573101 and 61573100) and the Fundamental Research Funds for the Central Universities (No. 2242019k30017).
Cite this article
Qian, K., Jing, X., Duan, Y. et al. Grasp Pose Detection with Affordance-based Task Constraint Learning in Single-view Point Clouds. J Intell Robot Syst 100, 145–163 (2020). https://doi.org/10.1007/s10846-020-01202-3