Abstract
This paper presents a framework for a marker-less human pose recognition system that identifies key body extremities using a network of calibrated low-cost depth sensors. Depth sensors overcome the challenges of low illumination, which usually compromises the information from RGB cameras. Furthermore, using multiple depth sensors complements the available information with greater visibility and less self-occlusion. A simple algorithm finds the connections between the aligned and updated meshes produced by the multiple sensors. These connections help to fuse the meshes into one large geodesic graph network. On this graph, a novel algorithm is applied to identify key body extremities such as the head, hands, and feet of a human subject. A geodesic mapping is applied to the fused point cloud to produce a set of distinct topological clusters of 3D points. These clusters generate a hierarchical skeleton tree graph (Reeb graph) and produce a set of features for the semantic identification of key body extremities. The combination of the shape model and the semantic classification finally leads to pose recognition. The paper assesses the proposed framework and compares it with another available technique across a succession of experimental configurations.
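The idea of locating extremities on a geodesic graph can be illustrated with a minimal sketch. It is not the authors' algorithm: it leaves out mesh fusion, the Reeb graph, and the semantic classifier, and shows only a generic farthest-point search over a k-nearest-neighbour graph using Dijkstra shortest paths. All function names, the choice of k, and the toy point set are assumptions made for illustration.

```python
import heapq
import math


def make_star(limbs):
    """Toy point set: unit-spaced points along each (direction, length) limb from the origin."""
    points = [(0.0, 0.0, 0.0)]
    for (dx, dy, dz), length in limbs:
        for step in range(1, length + 1):
            points.append((dx * step, dy * step, dz * step))
    return points


def knn_graph(points, k):
    """Undirected k-nearest-neighbour graph; edge weights are Euclidean distances."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path (graph geodesic) distances from one source node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def find_extremities(points, k, count, source):
    """Repeatedly pick the node farthest (geodesically) from the source and all earlier picks."""
    graph = knn_graph(points, k)
    dist = geodesic_distances(graph, source)
    extremities = []
    for _ in range(count):
        far = max(dist, key=dist.get)
        extremities.append(far)
        # Treat the new extremity as an extra source: keep the smaller distance per node.
        from_far = geodesic_distances(graph, far)
        dist = {i: min(dist[i], from_far.get(i, math.inf)) for i in dist}
    return extremities
```

Called on a star-shaped toy set with five limbs of different lengths, with the origin as the source node, `find_extremities` returns the indices of the five limb tips. On real sensor data the source would more plausibly be a point near the body centre, and k would need tuning so that the graph stays connected without bridging separate limbs.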
Ethics declarations
Conflicts of Interest
Nasreen Mohsin declares that she has no conflict of interest. Dr. Shahram Payandeh declares that he has no conflict of interest.
About this article
Cite this article
Mohsin, N., Payandeh, S. Clustering and Identification of key body extremities through topological analysis of multi-sensors 3D data. Vis Comput 38, 1097–1120 (2022). https://doi.org/10.1007/s00371-021-02070-0
DOI: https://doi.org/10.1007/s00371-021-02070-0