An anchor-based graph method for detecting and classifying indoor objects from cluttered 3D point clouds
Introduction
Fast and stable detection and classification of indoor objects from scanned point clouds have been instrumental in many applications, such as autonomous vehicles (Mattausch et al., 2014, Meyer et al., 2019, Naseer et al., 2018), indoor reconstruction (Kang et al., 2020, Li et al., 2018a, Sharif et al., 2017, Wang et al., 2017), robotics (Breuer et al., 2011, Li et al., 2020) and city planning (Rui et al., 2018, Vosselman et al., 2017, Yousefhussien et al., 2018). Moreover, recent advances in scanning technology greatly accelerate data acquisition (Gupta et al., 2015, Rui et al., 2018, Yulan et al., 2014) and improve the accuracy of the scanned point cloud (Mattausch et al., 2014, Ochmann et al., 2019, Wang et al., 2017, Zolanvari et al., 2018). These achievements have contributed to the flourishing of the study and development of 3D indoor object detection and classification for point clouds.
Many methods (Czerniawski et al., 2018, Günther et al., 2017, Mattausch et al., 2014, Nan et al., 2012, Valero et al., 2016, Li et al., 2018b, Verdoja et al., 2017) have been presented for detecting indoor objects such as tables, chairs and bookcases from scanned point clouds of cluttered indoor scenes by various geometry-based means (Mattausch et al., 2014, Wang et al., 2017). Despite the progress made by those methods, one inherent defect is that their success and applicability depend on the assumption that all indoor objects are oriented upward with respect to the ground or, more restrictively, perpendicular to the ground (Czerniawski et al., 2018, Günther et al., 2017, Mattausch et al., 2014, Nan et al., 2012, Tchapmi et al., 2017, Valero et al., 2016, Wang et al., 2017, Li et al., 2018b, Qi et al., 2017, Verdoja et al., 2017). For a complicated indoor scene with clutter, occlusion and non-upward object orientations, those methods cannot detect and classify indoor objects with oblique or even curved surfaces, which violate the underlying assumption.
In addition to the abovementioned semantic methodology, substantial attention has recently been paid to the methodology of deep learning. As reviewed by Guo et al. (2020), the deep learning methodology, including multi-view-based (Tatarchenko et al., 2018, Su et al., 2015), voxel-based (Choy et al., 2019, Tchapmi et al., 2017) and point-based (Liang et al., 2019, Qi et al., 2020, Jiang et al., 2019, Li et al., 2018b, Qi et al., 2016, Qi et al., 2017, Qi et al., 2019) approaches, provides a mechanism with great potential for semantically detecting and classifying indoor objects. However, its effectiveness depends on the completeness of the training data: only rich training data can produce satisfactory segmentations and classifications. For scenes with arbitrarily oriented objects, it is not easy to acquire all the necessary training data, since a change in the orientation or pose of an object may generate a vastly different range of features, so such objects present a wide variety of discriminative features to be captured by the learning mechanism. For these kinds of scenes, the semantic methodology shows its merit, because semantic structures can be embedded in the process of detecting and classifying indoor objects with arbitrary orientations. Therefore, finding structural and geometric features that are independent of the variation in object orientation warrants further exploration.
The graph-based semantic methodology has been proposed to overcome this notable limitation in the field of indoor modelling from point clouds (Shi et al., 2015, Spina, 2015, Wang et al., 2016). To represent indoor objects precisely and concisely, a graph approach based on functional parts (referred to as anchors) (Spina, 2015, Wang et al., 2016) has been proposed to capture indoor objects via a priori segmented patches; it describes one object as a graph formed by connecting anchors with other parts. As observed by Fu et al. (2008), Laga et al. (2013), Nan et al. (2012) and Wang et al. (2016), there is a strong correlation between the geometric shape of anchors and the upward orientation of man-made objects. In light of such observations, the local coordinate system (LCS) in each graph can be refined by the anchor’s normal vector and position, which has proven reliable in classifying arbitrarily oriented objects in terrestrial laser scanning (TLS) point clouds. As the applicability of those methods depends heavily on the extracted anchors, a prominent flaw is that anchors are extracted as planar primitives under a definition that relies largely on their geometric area. For example, a chair with two anchors (the seat and the back) in Fig. 1a may appear with anchors that are non-planar primitives, as shown in Fig. 1b, or with anchors that are missing some parts, as shown in Fig. 1c; in both cases, anchor extraction fails. Furthermore, those methods assume that all legs of indoor chairs or tables can be segmented as cylinders, an assumption that is also sensitive to missing parts. As missing parts of indoor objects are common and unavoidable in scanned point clouds, especially in RGBD datasets (Dong et al., 2018), finding a method that can both accommodate non-upward directions and extract accurate anchors remains a critical issue in the field of detection and classification of indoor objects.
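To make the idea of an anchor-derived LCS concrete, the following is a minimal sketch, not the procedure of the cited works. It assumes that an anchor is available as an array of 3D points, that its normal can be estimated as the direction of least variance (PCA), and that the LCS origin is the anchor centroid. The function names `anchor_lcs` and `to_lcs` are hypothetical, and the sign of the estimated normal is ambiguous and would need to be resolved separately.

```python
import numpy as np

def anchor_lcs(points):
    """Estimate a local coordinate system from a roughly planar anchor patch.

    Assumption (not from the paper): origin = centroid, z-axis = PCA normal.
    Returns (origin, R), where the rows of R are the x, y and z axes.
    """
    pts = np.asarray(points, dtype=float)
    origin = pts.mean(axis=0)
    # SVD of the centred points: rows of vt are principal directions,
    # ordered from largest to smallest variance.
    _, _, vt = np.linalg.svd(pts - origin, full_matrices=False)
    x_axis, normal = vt[0], vt[2]
    # Complete a right-handed frame (x cross y equals the normal).
    y_axis = np.cross(normal, x_axis)
    return origin, np.vstack([x_axis, y_axis, normal])

def to_lcs(points, origin, R):
    """Express world-space points in the anchor's local coordinate system."""
    return (np.asarray(points, dtype=float) - origin) @ R.T
```

In this sketch, points lying on the anchor plane map to local coordinates with a z value near zero regardless of how the object is tilted in the scene, which is the orientation independence the paragraph above refers to.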
This paper presents an anchor-based graph method to detect and classify arbitrarily oriented indoor objects. The contribution of this study is twofold: anchors are extracted from the structural adjacency relationships among the parts of each object instead of from the parts’ geometric area, and the anchor-based graphs are matched by globally optimizing all pairing subgraphs in a super-graph with fully connected nodes. The optimal matching based on the maximum likelihood outperforms the approximations achieved by many previous techniques (Armeni et al., 2016, Liang et al., 2019, Spina, 2015). Using the adjacency relationship, an anchor can be correctly extracted even when parts are missing, since the adjacency between an anchor and other parts is retained irrespective of the area of the considered parts. The graph’s edges are reconstructed by connecting an anchor with its adjacent parts, which ensures the conciseness of the reconstructed graph.
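The adjacency idea can be illustrated with a small sketch. This is an illustration under stated assumptions, not the paper's algorithm: patches are given as arrays of 3D points, two patches count as adjacent when their closest points lie within a tolerance `tol`, and the anchor is taken to be the patch with the most adjacent patches. That selection rule, the tolerance value and all function names are hypothetical.

```python
import numpy as np

def min_distance(a, b):
    """Smallest Euclidean distance between two point sets (brute force)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min()

def adjacency(patches, tol=0.05):
    """Symmetric boolean matrix: True where two patches lie within tol."""
    n = len(patches)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            adj[i, j] = adj[j, i] = min_distance(patches[i], patches[j]) <= tol
    return adj

def anchor_graph(patches, tol=0.05):
    """Pick an anchor and connect it to its adjacent parts.

    Hypothetical rule: the anchor is the patch with the highest adjacency
    count. Edges link only the anchor to its neighbours (a star graph).
    """
    adj = adjacency(patches, tol)
    anchor = int(adj.sum(axis=1).argmax())
    edges = [(anchor, int(j)) for j in np.flatnonzero(adj[anchor])]
    return anchor, edges
```

For a table modelled as one top patch and four leg patches, this sketch selects the top as the anchor. If part of the top is missing from the scan, the remaining part still touches some legs and is still selected, whereas an area-based criterion could reject it.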
The remainder of this paper is organized as follows. Related works are presented in Section 2. The proposed method of indoor object classification is described in Section 3, followed by experiments conducted with real-world datasets in Section 4, and an evaluation is provided in Section 5. Section 6 outlines the conclusions drawn from the previous discussions.
Related works
The current methods for object detection and classification can be classified into two groups: semantic methodology and deep learning methodology. The semantic methodology classifies indoor objects via pre-defined features of indoor objects, while the deep learning methodology implements object segmentation and classification by discriminative features learnt from the pre-labelled training samples.
Overview
Given a raw point set of an indoor scene, object detection and classification can be defined as segmenting and classifying indoor objects. The entire process consists of five main steps: patch segmentation, graph reconstruction, approximate clustering via the anchor’s geometric shape, graph clustering via a super-graph and object refinement, as depicted in Fig. 2.
The input point clouds are first partitioned into a collection of patches using the efficient random sample consensus (RANSAC)
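For readers unfamiliar with RANSAC, the following is a minimal sketch of plain RANSAC fitting a single plane to a point set. It is a generic textbook variant, not the efficient RANSAC used for patch segmentation here, and it extracts only one primitive. The function name `ransac_plane` and the values of `threshold`, `iterations` and `seed` are illustrative assumptions.

```python
import numpy as np

def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Fit one plane n.x + d = 0 by plain RANSAC.

    Returns ((normal, d), inlier_mask); the model is None if every
    sampled triple was degenerate.
    """
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(pts), dtype=bool)
    best_model = None
    for _ in range(iterations):
        # Minimal sample: three distinct points define a candidate plane.
        sample = pts[rng.choice(len(pts), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # skip (nearly) collinear samples
            continue
        normal /= norm
        d = -normal @ sample[0]
        # Inliers are points within `threshold` of the candidate plane.
        mask = np.abs(pts @ normal + d) <= threshold
        if mask.sum() > best_mask.sum():
            best_mask, best_model = mask, (normal, d)
    return best_model, best_mask
```

A segmentation into several patches could, under the same assumptions, call such a routine repeatedly and remove the inliers after each fit; the efficient variant additionally handles other primitive types and accelerates sampling.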
Experimental setup
The implementation details of the experiments, including the specification of benchmark datasets, evaluation criteria and parameter settings for our method, are described in this section. The algorithm was implemented with the Point Cloud Library (PCL), CloudCompare and MATLAB. All experiments were performed on a 3.60 GHz Intel Core i7-4790 processor with 12 GB of RAM.
Performance comparison
To compare the performance of our proposed method with that of other state-of-the-art approaches, we examined two benchmark datasets (termed “Bench I” in Fig. 9 and “Bench II” in Fig. 10) which were tested by other methods such as Nan et al., 2012, Mattausch et al., 2014, Wang et al., 2016 and deep learning methods such as PointCNN (Li et al., 2018b) and VoteNet (Qi et al., 2019). The methods in Nan et al., 2012, Mattausch et al., 2014 were geometry-based methods, while the method in Wang
Conclusion
Current methods for the classification of 3D indoor objects from point clouds rely on attributes extracted along the upright orientation and have shown obvious defects in classifying objects with various poses without carefully acquired training data. To eliminate such deficiencies, this paper presented an anchor-based graph method capable of handling arbitrarily oriented objects and evaluated its performance on seven popular benchmark datasets. Comprehensive experiments
Funding
This study is funded by the National Natural Science Foundation of China (41871298, 42071366) and the National Key R&D Program of China (2017YFB0503701).
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors would like to gratefully acknowledge Dr. Iro Armeni, Dr. Claudio Mura, Dr. Nan Liangliang, Dr. Axel Wendt, Dr. Rares Ambrus and Dr. Angela Dai for their help in accessing tested data.
References (62)
- 6D DBSCAN-based segmentation of building point clouds for planar object classification. Autom. Constr. (2018)
- An efficient global energy optimization approach for robust 3D plane segmentation of point clouds. ISPRS J. Photogramm. Remote Sens. (2018)
- Model-based furniture recognition for building semantic object maps. Artif. Intell. (2017)
- Toward better boundary preserved supervoxel segmentation for 3D point clouds. ISPRS J. Photogramm. Remote Sens. (2018)
- Automatic reconstruction of fully volumetric 3D building models from oriented point clouds. ISPRS J. Photogramm. Remote Sens. (2019)
- Contextual segment-based classification of airborne laser scanner data. ISPRS J. Photogramm. Remote Sens. (2017)
- Cluttered indoor scene modeling via functional part-guided graph matching. Comput. Aided Geom. Des. (2016)
- A multi-scale fully convolutional network for semantic labeling of 3D point clouds. ISPRS J. Photogramm. Remote Sens. (2018)
- Three-dimensional building facade segmentation and opening area detection from point clouds. ISPRS J. Photogramm. Remote Sens. (2018)
- Topology-varying 3D shape creation via structural blending. ACM Trans. Graphics (2014)
- Deformation-driven topology-varying 3D shape correspondence. ACM Trans. Graphics
- Automatic Room Segmentation from Unstructured 3-D Data of Indoor Environments. IEEE Rob. Autom. Lett.
- 3D Semantic Parsing of Large-Scale Indoor Spaces. 2016 IEEE Conference on Computer Vision and Pattern Recognition
- Johnny: An autonomous service robot for domestic environments. J. Intell. Rob. Syst.
- ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
- Upright orientation of man-made objects. ACM Trans. Graphics
- Object-Based Pose Graph for Dynamic Indoor Environments. IEEE Rob. Autom. Lett.
- Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell.
- Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation. Int. J. Comput. Vision
- Energy-Based Geometric Multi-model Fitting. Int. J. Comput. Vision
- A Review of Techniques for 3D Reconstruction of Indoor Environments. ISPRS Int. J. Geo-Inf
- Using Spatial Relations of Objects in Real World Scenes for Scene Structuring and Scene Understanding
- Geometry and context for semantic correspondences and functionality recognition in man-made 3D shapes. ACM Trans. Graphics
- Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation. Int. J. Robot. Res.
- A UWB-Based Indoor Positioning System Employing Neural Networks. J. Geovisualiz. Spat. Anal.
- Reconstruction of Three-Dimensional (3D) Indoor Interiors with Multiple Stories via Comprehensive Segmentation. Remote Sens.