Elsevier

Pattern Recognition

Volume 111, March 2021, 107691
KDD: A kernel density based descriptor for 3D point clouds

https://doi.org/10.1016/j.patcog.2020.107691

Highlights

  • A novel 3D local descriptor (KDD) is proposed that achieves a satisfactory and balanced performance in terms of descriptiveness, robustness, and compactness. KDD is also very simple: it encodes the spatial distribution of points directly, without computing any geometric attributes or performing rotational projection operations.

  • KDD is combined with different matching metrics for different datasets, and a strategy is provided for selecting matching metrics for datasets with different levels of resolution quality.

  • We apply the proposed method to a real-world dataset, the Terracotta fragment models; the favorable results demonstrate the effectiveness of KDD and highlight the utility of the proposed method.

Abstract

3D feature description is one of the central techniques in point cloud processing, since many point cloud processing techniques take as input the point-to-point correspondences obtained via feature descriptors. A feature descriptor encodes information about the underlying surface around a feature point so that one local surface can be distinguished from another. Existing descriptors focus on accumulating geometric or topological measurements into histograms, or on encoding 2D images acquired by rotationally projecting 3D local surfaces onto 2D planes. Histograms can hardly deal with information in three or more dimensions, and rotational projection introduces many unnecessary intermediate computations. To overcome these limitations, this article presents a descriptor named Kernel Density Descriptor (KDD). One core contribution of the method is to encode the information of the whole 3D space around the feature point via kernel density estimation; another is a strategy for selecting matching metrics for datasets with different levels of resolution quality. We compare KDD against several representative descriptors on publicly available datasets. The experimental results demonstrate that KDD achieves a satisfactory and balanced performance in terms of descriptiveness, robustness, and compactness, and the comparisons validate the overall superiority of our method. The benefits and applicability to object registration, object recognition, and 3D object reconstruction are demonstrated by the favorable results obtained on both public datasets and real-world point clouds of Terracotta fragments.

Introduction

Feature point descriptors for point clouds are a fundamental representation structure in 3D modeling [1], 3D object recognition [2], and 3D model registration [3]. The construction of 3D local feature descriptors is therefore at the core of many computer graphics and computer vision technologies.

To be useful for downstream tasks, descriptors need to satisfy several performance requirements, such as descriptiveness, robustness, compactness, efficiency, and a small memory footprint [4]. Among these requirements, descriptiveness and robustness are considered two of the most essential. A feature descriptor is descriptive if it encodes predominant and sufficient information about the local surface; that is, it should provide enough information to distinguish one local surface from another. A descriptor is robust if it is insensitive to factors that can disturb the surface, e.g., noise and non-uniform sampling density.

Local features have been extensively investigated over the past few decades with the aim of designing descriptors that are distinctive and robust, and the existing descriptors have been comprehensively evaluated and analyzed to improve all aspects of performance [4]. Apart from deep learning-based methods, there are mainly three ways to directly construct 3D local feature descriptors.

One of the best-known groups is the histogram-based descriptors, for example, Signature of Histograms of Orientations (SHOT) [5]. A histogram is a very simple and functional representation for one- or two-dimensional information, which makes it practical for density estimation and rapid visualization of data in one or two dimensions. However, histograms have clear drawbacks: (1) the starting location and bandwidth of the bins seriously affect the performance; (2) because of the curse of dimensionality, the number of bins grows exponentially with the number of dimensions; and (3) histograms can hardly deal with information in three or more dimensions, which is a much more serious problem for representation tasks in the 3D domain.
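
Drawback (2) can be made concrete with a short worked formula. The notation here is an assumption introduced for illustration and is not taken from the paper: b is the number of bins per axis, d the number of dimensions, n the number of points, h a bandwidth, and K a kernel function. The first expression gives the bin count of a regular d-dimensional histogram; the second is the textbook kernel density estimator, which evaluates a smooth density at any location x without fixing bin origins in advance.

```latex
N_{\text{bins}} = b^{d},
\qquad
\hat{f}_h(\mathbf{x}) = \frac{1}{n\,h^{d}} \sum_{i=1}^{n} K\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h}\right)
```

For example, with b = 10 bins per axis, a histogram needs 10 bins in one dimension but 1000 in three. Whether the paper uses exactly this form of the estimator cannot be confirmed from the text shown here.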

Another category of descriptors employs binary patterns to describe the underlying surfaces, including 3D Local Binary Patterns (3D-LBP) [6] and Binary SHOT (B-SHOT) [7]. Binary descriptors outperform histogram-based descriptors in terms of compactness and efficiency, but achieve relatively lower scores in descriptiveness [8]. Furthermore, several 3D binary descriptors apply to voxel models only [6].

Alternatively, the rotation and projection mechanism generates descriptors by first repeatedly rotating the local surface and projecting it onto 2D planes, and then encoding the resulting 2D images or range images. A representative descriptor is Rotational Projection Statistics (RoPS) [9]. This mechanism also achieves satisfactory results; however, the rotation and projection operations introduce many unnecessary intermediate computations.

From an intuitive point of view, the geometry of the underlying surface around a feature point is shaped by the distribution of its neighboring points. That is to say, different shapes have different point densities in different regions of the 3D space around the feature point. Furthermore, two points with the same Euclidean distance may form different shapes, because a point's contribution to the underlying shape differs along the different axes of the local reference frame (LRF). Based on these observations, instead of computing geometric attributes of the points in the support region, we convert the description of the feature point into a kernel density estimation of the 3D space around it, with point densities computed separately in different regions. We therefore propose a novel descriptor termed the kernel density descriptor (KDD). First, we compute the LRF at the feature point and build a cubic support region around the feature point that is aligned with the LRF. Then the kernel density of each cube in the 3D space is estimated via kernel density estimation, which overcomes the limitations of histograms. Note that, to promote robustness to point non-uniformity, all the cubes in the support region participate in the density estimation, whether they are empty or not, instead of only the non-empty cubes as in [10]. Moreover, rather than evaluating different feature descriptors using different matching metrics [11], we combine KDD with different matching metrics for different datasets and provide a strategy for selecting matching metrics for datasets with different levels of resolution quality.

We compare the proposed KDD against several representative descriptors on publicly available datasets. The experimental results demonstrate that KDD is more descriptive and achieves the highest recall among the tested descriptors on the selected datasets, both at different noise levels and at different simplification rates. In addition, KDD is invariant to rigid transformations and achieves satisfactory results in terms of robustness and compactness. We also apply the proposed method to a real-world dataset, the Terracotta fragment models; the favorable results demonstrate the effectiveness of KDD and highlight the utility of the proposed method.
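
The construction outlined above (an LRF at the feature point, an LRF-aligned cubic support region, and a kernel density per cube) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the PCA-based LRF with a sum-of-projections sign rule, the isotropic Gaussian kernel, the evaluation of the density at each cube center, and the names `local_reference_frame`, `kdd_sketch`, `half_side`, `cells`, and `bandwidth` are all hypothetical choices made for this example.

```python
import numpy as np


def local_reference_frame(neighbors, keypoint):
    """Return a 3x3 matrix whose rows are LRF axes (assumed PCA-based LRF)."""
    if len(neighbors) < 3:
        return np.eye(3)  # too few points to estimate a frame; fall back to world axes
    centered = neighbors - keypoint
    cov = centered.T @ centered / len(neighbors)
    _, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    axes = eigvecs[:, ::-1].T.copy()       # rows sorted by decreasing variance
    for i in range(2):
        # Assumed sign rule: each axis points toward the bulk of the neighbors.
        if np.sum(centered @ axes[i]) < 0:
            axes[i] = -axes[i]
    axes[2] = np.cross(axes[0], axes[1])   # keep the frame right-handed
    return axes


def kdd_sketch(points, keypoint, half_side=1.0, cells=4, bandwidth=0.3):
    """Kernel density at the center of every cube of an LRF-aligned cubic region."""
    # Neighbors inside the sphere that encloses the cube are used for the LRF.
    dist = np.linalg.norm(points - keypoint, axis=1)
    axes = local_reference_frame(points[dist <= half_side * np.sqrt(3.0)], keypoint)

    # Express points in the LRF and keep those inside the cubic support region.
    local = (points - keypoint) @ axes.T
    local = local[np.all(np.abs(local) <= half_side, axis=1)]

    # Centers of all cells**3 cubes; empty cubes are kept on purpose.
    edges = np.linspace(-half_side, half_side, cells + 1)
    c = (edges[:-1] + edges[1:]) / 2.0
    gx, gy, gz = np.meshgrid(c, c, c, indexing="ij")
    centers = np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)

    if len(local) == 0:
        return np.zeros(len(centers))

    # Gaussian kernel density estimate evaluated at every cube center.
    sq_dist = np.sum((centers[:, None, :] - local[None, :, :]) ** 2, axis=2)
    norm = (2.0 * np.pi * bandwidth ** 2) ** 1.5
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1) / (len(local) * norm)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(500, 3)) * np.array([1.0, 0.6, 0.3])
    descriptor = kdd_sketch(cloud, cloud[0])
    print(descriptor.shape)
```

Every cube contributes one entry to the vector, including cubes that contain no points, which mirrors the statement that all cubes take part in the density estimation. The grid size and bandwidth used here are placeholders, not values from the paper.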

The main contributions of this work include:

  • (1) the presentation of a 3D local descriptor (KDD) that achieves a satisfactory and balanced performance in terms of descriptiveness, robustness, and compactness; it is also simple, since it encodes the spatial distribution of points without computing any geometric attributes or performing rotational projection operations;

  • (2) the presentation of a strategy for selecting matching metrics for datasets with different levels of resolution quality. In particular, difference-based, correlation-based, and entropy-based distances are all suitable for datasets with a high or medium level of resolution quality. However, only correlation-based distances (such as cosine distance and the Pearson correlation coefficient) and entropy-based distances (for example, KL divergence) are suitable for datasets with a low level of resolution quality.
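
The three metric families named in contribution (2) can be illustrated with a short sketch. The function names, the use of Euclidean distance as the difference-based example, the `eps` smoothing, and the brute-force `match` helper are assumptions made for this example; the paper's exact definitions are not given in the text shown here.

```python
import numpy as np


def euclidean_distance(a, b):
    """Difference-based: L2 norm of the element-wise difference."""
    return float(np.linalg.norm(a - b))


def cosine_distance(a, b, eps=1e-12):
    """Correlation-based: one minus the cosine of the angle between a and b."""
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def pearson_distance(a, b, eps=1e-12):
    """Correlation-based: one minus the Pearson correlation coefficient."""
    a0, b0 = a - a.mean(), b - b.mean()
    return float(1.0 - np.dot(a0, b0) / (np.linalg.norm(a0) * np.linalg.norm(b0) + eps))


def kl_divergence(a, b, eps=1e-12):
    """Entropy-based: KL divergence after normalizing both vectors to sum to 1."""
    p = (a + eps) / np.sum(a + eps)
    q = (b + eps) / np.sum(b + eps)
    return float(np.sum(p * np.log(p / q)))


def match(desc_a, desc_b, metric):
    """For each descriptor in desc_a, return the index of its closest row in desc_b."""
    return np.array([int(np.argmin([metric(a, b) for b in desc_b])) for a in desc_a])


if __name__ == "__main__":
    model = np.eye(3) + 0.1                  # three toy descriptors with non-negative entries
    scene = 2.0 * model[[2, 0, 1]]           # same descriptors, reordered and rescaled
    print(match(scene, model, cosine_distance))
    print(match(scene, model, kl_divergence))
```

In this sketch the cosine, Pearson, and KL measures are unaffected by a uniform rescaling of a descriptor, whereas the Euclidean distance is not. That is a property of these particular definitions; it is not offered as the paper's explanation for why some metrics suit low-resolution data.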

Section snippets

Related works

Researchers have devoted great effort to the local surface description problem over the past few decades, and a diverse range of 3D local descriptors has been presented. Among the state-of-the-art descriptors, some are generated directly on the point clouds by encoding the information of the local surfaces into histograms, whereas others are constructed on 2D images acquired by rotating and projecting the underlying surfaces onto 2D planes. Therefore we categorize these algorithms into

Overview

The unorganized point cloud is the processing target throughout the life cycle of the proposed method (as shown in Fig. 1(a)), which means that the proposed method deals with no other data types such as 2D images, range images, or meshes. The construction of the proposed KDD is very simple since it contains only two phases (please refer to Fig. 1 for the pipeline), with neither geometric attribute estimation nor rotational projection operations.

First, the proposed method computes the LRFs at key

Experimental results

This section is organized as follows. The experimental setup is first introduced in detail, including datasets, descriptors, evaluation criteria used in the experiments and the experimental procedures. Second, the performance of KDD is evaluated and compared with several representative descriptors, in terms of the descriptiveness, compactness, robustness and time efficiency. Finally, we summarize the overall performance of the proposed KDD.

Applications

The output of our method is a set of point-to-point correspondences on two given point clouds. A lot of techniques that rely on point sets apply the point-to-point correspondences as input data, such as 3D object registration [3], 3D object recognition [2] and 3D object reconstruction [37]. In this section, some matching results on the public datasets and real-world datasets are presented to intuitively demonstrate the effectiveness of KDD and highlight the utility of the proposed method.

Given

Conclusion

We presented a 3D local feature descriptor termed KDD. In our method, we first compute the LRFs at key points and build the cubic support region, which serves the kernel density estimation and also promotes the robustness of the proposed KDD descriptor. Based on the cubic support region, we estimate the kernel densities of the local cubes. The density estimates are then aggregated into a vector, thereby forming the KDD descriptor. Consequently, the true point correspondences can

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors would like to thank all the reviewers for their valuable comments, the providers of the three datasets and the contributors of Point Cloud Library. The terracotta models are provided by Institute of Visualization of Northwest University. This work was supported by the National Natural Science Foundation of China (NSFC) (61802311, 61772421, 61731015, 61802244, 61602380), Key R&D Program of Shaanxi Province (2019SF-272), Natural Science Basic Research Plan in Shaanxi Province of China

References (37)

  • Y. He et al., An iterative closest points algorithm for registration of 3D laser scanner point clouds with geometric features, Sensors (2017)

  • Y. Guo et al., A comprehensive performance evaluation of 3D local feature descriptors, Int. J. Comput. Vis. (2015)

  • T. Matsuda et al., Lightweight binary voxel shape features for 3D data matching and retrieval, Proceedings - 2015 IEEE International Conference on Multimedia Big Data, BigMM 2015 (2015)

  • S.M. Prakhya et al., B-SHOT: A binary feature descriptor for fast and efficient keypoint matching on 3D point clouds, IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (2015)

  • Y. Guo et al., Rotational projection statistics for 3D local surface description and object recognition, Int. J. Comput. Vis. (2013)

  • K. Tang et al., Signature of geometric centroids for 3D local shape description and partial shape matching, Asian Conference on Computer Vision (2016)

  • A. Mian et al., Three-dimensional model-based object recognition and segmentation in cluttered scenes, IEEE Trans. Pattern Anal. Mach. Intell. (2006)

  • A. Johnson et al., Using spin images for efficient object recognition in cluttered 3D scenes, IEEE Trans. Pattern Anal. Mach. Intell. (1999)

Yuhe Zhang received the B.S. degree in software engineering and the Ph.D. degree in computer applied technology from Northwest University, Xi'an, China, in 2012 and 2017, respectively. From July 2017 to July 2020, she was a lecturer at the School of Information Science and Technology, Northwest University of China, Xi'an, China. Since August 2020, she has been an Associate Professor at the School of Information Science and Technology, Northwest University of China. Her research interests include computer graphics, image processing, intelligent information processing, and the digital restoration of cultural heritage.

Chunhui Li received his B.S. degree from Northwest University of China in June 2020. He is currently working toward the M.S. degree at Northwest University of China. His research interests include computer graphics and the digital restoration of cultural heritage. In 2019, he won the Meritorious Winner Award in the MCM/ICM competition.

Bao Guo received his B.S. degree from Northwest University of China in 2019. He is currently working toward the M.S. degree at Northwest University of China. His research interests include point cloud registration, feature extraction, and the digital restoration of cultural heritage.

Chenhao Guo is working toward the B.S. degree at Northwest University of China. His research interests include computer vision and point cloud registration.

Shunli Zhang received the B.S. degree in applied mathematics from Xidian University, Xi'an, China, in 1997, and the M.S. and Ph.D. degrees in aeronautical and astronautical manufacturing engineering from Northwestern Polytechnical University, Xi'an, China, in 2004 and 2010, respectively. From 2011 to 2014, he was a postdoctoral researcher with Northwestern Polytechnical University. Since 2014, he has been a professor at the School of Information Science and Technology, Northwest University, Xi'an, China. His research interests include computed tomography, image processing, biomedical imaging, and parallel computing.
