An anchor-based graph method for detecting and classifying indoor objects from cluttered 3D point clouds

https://doi.org/10.1016/j.isprsjprs.2020.12.007

Abstract

Most existing 3D indoor object classification methods have achieved impressive results under the assumption that all objects are oriented upward with respect to the ground. To relax this assumption, great effort has been made to handle arbitrarily oriented objects in terrestrial laser scanning (TLS) point clouds. As one of the most promising solutions, anchor-based graphs can be used to classify freely oriented objects. However, this approach suffers from missed anchor detection, since valid detection relies heavily on the completeness of an anchor’s point cloud and is sensitive to missing data. This paper presents an anchor-based graph method to detect and classify arbitrarily oriented indoor objects. The anchors of each object are extracted from the structurally adjacent relationships among parts instead of from the parts’ geometric metrics. With adjacency, an anchor can be correctly extracted even with missing parts, since the adjacency between an anchor and other parts is retained irrespective of the area extent of the considered parts. The best graph matching is achieved by finding the optimal corresponding node pairs in a super-graph with fully connected nodes based on maximum likelihood. The performance of the proposed method is evaluated with three indicators (object precision, object recall and object F1-score) on seven datasets. The experimental tests demonstrate its effectiveness on TLS point clouds, RGBD point clouds and panoramic RGBD point clouds, with scores of approximately 0.8 for object precision and recall and over 0.9 for chair precision and table recall.

Introduction

Fast and stable detection and classification of indoor objects from scanned point clouds have been instrumental in many applications, such as autonomous vehicles (Mattausch et al., 2014, Meyer et al., 2019, Naseer et al., 2018), indoor reconstruction (Kang et al., 2020, Li et al., 2018a, Sharif et al., 2017, Wang et al., 2017), robotics (Breuer et al., 2011, Li et al., 2020) and city planning (Rui et al., 2018, Vosselman et al., 2017, Yousefhussien et al., 2018). Moreover, recent advances in scanning technology greatly accelerate data acquisition (Gupta et al., 2015, Rui et al., 2018, Yulan et al., 2014) and improve the accuracy of the scanned point cloud (Mattausch et al., 2014, Ochmann et al., 2019, Wang et al., 2017, Zolanvari et al., 2018). These achievements have contributed to the flourishing of the study and development of 3D indoor object detection and classification for point clouds.

Many methods (Czerniawski et al., 2018, Günther et al., 2017, Mattausch et al., 2014, Nan et al., 2012, Valero et al., 2016, Li et al., 2018b, Verdoja et al., 2017) have been presented for detecting tables, chairs and bookcases as indoor objects from scanned point clouds by various geometry-based means in cluttered indoor scenes (Mattausch et al., 2014, Wang et al., 2017). Despite the progress made by those methods, one inherent defect is that their success and applicability depend on the assumption that all indoor objects are oriented upward with respect to the ground or, more restrictively, perpendicular to the ground (Czerniawski et al., 2018, Günther et al., 2017, Mattausch et al., 2014, Nan et al., 2012, Tchapmi et al., 2017, Valero et al., 2016, Wang et al., 2017, Li et al., 2018b, Qi et al., 2017, Verdoja et al., 2017). For a complicated indoor scene or environment where cluttered and occluded objects are not upward-oriented, those methods cannot detect and classify indoor objects with oblique or even curved surfaces, which violate the underlying assumption.

In addition to the abovementioned semantic methodology, substantial attention has recently been paid to deep learning. As reviewed by Guo et al. (2020), the deep learning methodology, including multi-view-based (Tatarchenko et al., 2018, Su et al., 2015), voxel-based (Choy et al., 2019, Tchapmi et al., 2017) and point-based (Liang et al., 2019, Qi et al., 2020, Jiang et al., 2019, Li et al., 2018b, Qi et al., 2016, Qi et al., 2017, Qi et al., 2019) approaches, provides a promising mechanism for semantically detecting and classifying indoor objects. However, its effectiveness depends on the richness of the training data: only rich training data can produce satisfactory segmentation and classification. For scenes with arbitrarily oriented objects, it is not easy to acquire all the necessary training data, since the varying orientation or pose of an object generates a vastly different range of discriminative features to be captured by the learning mechanism. For such scenes, the semantic methodology shows its merit, as semantic structures can be embedded in the process of detecting and classifying indoor objects with arbitrary orientations. Therefore, finding structural and geometric features that are independent of object orientation warrants further exploration.

The graph-based semantic methodology has been proposed to overcome this notable limitation in the field of indoor modelling from point clouds (Shi et al., 2015, Spina, 2015, Wang et al., 2016). To represent indoor objects precisely and concisely, a graph approach based on functional parts (referred to as anchors) (Spina, 2015, Wang et al., 2016) captures indoor objects via a priori segmented patches, describing each object as a graph formed by connecting anchors with other parts. As observed by (Fu et al., 2008, Laga et al., 2013, Nan et al., 2012, Wang et al., 2016), there is a strong correlation between the geometric shape and the upward orientation of anchors in man-made objects. In light of such observations, the local coordinate system (LCS) in each graph can be refined by the anchor’s normal vector and position, which has proven reliable for classifying arbitrarily oriented objects in TLS point clouds. As the availability of those methods depends heavily on the extracted anchors, a prominent flaw is that anchors are extracted as planar primitives according to a definition that relies largely on their metric area. For example, a chair with two anchors (the seat and the back) in Fig. 1a may appear with non-planar anchors as shown in Fig. 1b or with anchors missing some parts as shown in Fig. 1c; in both cases anchor extraction fails. Furthermore, those methods assume that all legs of indoor chairs or tables can be segmented as cylinders, which is also sensitive to missing parts. Since missing parts of an indoor object are usual and unavoidable in scanned point clouds, especially in RGBD datasets (Dong et al., 2018), finding a method that can both accommodate non-upward orientations and extract accurate anchors remains a critical issue in the detection and classification of indoor objects.
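The adjacency idea can be illustrated with a minimal sketch (not the authors' implementation): the anchor of an object is taken as the patch adjacent to the most other parts, so it is found even when its own point cloud, and hence its area, is incomplete. The patch names and the `adjacency` structure below are hypothetical.

```python
# Sketch: select an anchor patch by structural adjacency rather than area.
# The data and function names here are illustrative, not the paper's code.

def extract_anchor(adjacency):
    """Return the patch id with the most adjacent parts.

    adjacency: dict mapping patch id -> set of adjacent patch ids.
    The anchor is the patch touching the most other parts, so it is
    recovered even when its own point cloud (and area) is incomplete.
    """
    return max(adjacency, key=lambda p: len(adjacency[p]))

# A chair: the seat touches the back and four legs; the seat's area
# never enters the criterion, so partial scans do not break it.
chair = {
    "seat": {"back", "leg1", "leg2", "leg3", "leg4"},
    "back": {"seat"},
    "leg1": {"seat"}, "leg2": {"seat"}, "leg3": {"seat"}, "leg4": {"seat"},
}
print(extract_anchor(chair))  # seat
```

Even if, say, half of the seat's points are missing, its adjacency to the back and legs is unchanged, so the same anchor is selected.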

This paper presents an anchor-based graph method to detect and classify arbitrarily oriented indoor objects. The contribution of this study is twofold: anchors are extracted from the structurally adjacent relationships among parts in each object instead of from the parts’ metric area, and the anchor-based graphs are matched by globally optimizing all pairing subgraphs in a super-graph with fully connected nodes. The optimal matching based on maximum likelihood outperforms the approximations achieved by many previous techniques (Armeni et al., 2016, Liang et al., 2019, Spina, 2015). With the adjacency relationship, an anchor can be correctly extracted even with missing parts, since the adjacency between an anchor and other parts is retained irrespective of the area extent of the considered parts. The graph’s edges are reconstructed by connecting an anchor with its adjacent parts, which ensures the conciseness of the reconstructed graph.
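The maximum-likelihood matching can be sketched, under simplifying assumptions, as an exhaustive search over one-to-one node pairings that maximizes the summed log-likelihood of pairwise similarities. The toy `similarity` function and node features below are hypothetical, not the paper's formulation.

```python
# Sketch: maximum-likelihood graph matching by brute-force enumeration.
# Feasible only for small graphs; the paper's super-graph optimization
# is more general, and the similarity model here is made up.
import itertools
import math

def best_matching(nodes_a, nodes_b, similarity):
    """Return the one-to-one pairing of nodes_a onto nodes_b that
    maximizes the total log-likelihood of pairwise similarities.
    similarity(a, b) must return a score in (0, 1]."""
    best, best_score = None, float("-inf")
    for perm in itertools.permutations(nodes_b, len(nodes_a)):
        score = sum(math.log(similarity(a, b)) for a, b in zip(nodes_a, perm))
        if score > best_score:
            best, best_score = list(zip(nodes_a, perm)), score
    return best

# Toy node features (e.g. a scalar shape descriptor per patch):
feat_a = {"seat": 1.0, "back": 0.5}
feat_b = {"p1": 0.95, "p2": 0.52}
sim = lambda a, b: math.exp(-abs(feat_a[a] - feat_b[b]))
match = best_matching(["seat", "back"], ["p1", "p2"], sim)
# [('seat', 'p1'), ('back', 'p2')]
```

Because the log turns the product of likelihoods into a sum, the globally optimal pairing is found exactly, which is the sense in which such matching improves on greedy approximations.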

The remainder of this paper is organized as follows. Related works are presented in Section 2. The proposed method of indoor object classification is described in Section 3, followed by experiments conducted with real-world datasets in Section 4, and an evaluation is provided in Section 5. Section 6 outlines the conclusions drawn from the previous discussions.

Section snippets

Related works

The current methods for object detection and classification can be classified into two groups: semantic methodology and deep learning methodology. The semantic methodology classifies indoor objects via pre-defined features of indoor objects, while the deep learning methodology implements object segmentation and classification by discriminative features learnt from the pre-labelled training samples.

Overview

Given a raw point set of an indoor scene, object detection and classification can be defined as segmenting and classifying indoor objects. The entire process consists of five main steps: patch segmentation, graph reconstruction, approximate clustering via the anchor’s geometric shape, graph clustering via a super-graph, and object refinement, as depicted in Fig. 2.

The input point clouds are first partitioned into a collection of patches using the efficient random sample consensus (RANSAC)
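A minimal RANSAC plane fit of the kind used for patch segmentation can be sketched as follows; the thresholds, iteration count and synthetic data are illustrative, not the paper's settings.

```python
# Sketch: fit one plane to a point cloud with RANSAC (illustrative only).
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.02, seed=0):
    """Return (normal, d, inlier_mask) for the plane n·x + d = 0
    supported by the most points within `threshold` of the plane."""
    rng = np.random.default_rng(seed)
    best_count, best = -1, None
    for _ in range(n_iters):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:               # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ p0
        mask = np.abs(points @ n + d) < threshold
        if mask.sum() > best_count:
            best_count, best = mask.sum(), (n, d, mask)
    return best

# Synthetic scene: 100 points on the floor z = 0 plus 20 clutter points.
rng = np.random.default_rng(1)
floor = np.column_stack([rng.uniform(size=100), rng.uniform(size=100), np.zeros(100)])
clutter = np.column_stack([rng.uniform(size=20), rng.uniform(size=20), rng.uniform(1, 2, 20)])
n, d, inliers = ransac_plane(np.vstack([floor, clutter]))
```

In practice the extraction is run repeatedly, removing each plane's inliers, until no sufficiently supported plane remains, yielding the collection of patches.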

Experimental setup

The implementation details of the experiments, including the specification of benchmark datasets, evaluation criteria and parameter settings for our method, are described in this section. The algorithm was implemented with the Point Cloud Library (PCL), CloudCompare and MATLAB. All experiments were performed on a 3.60 GHz Intel Core i7-4790 processor with 12 GB of RAM.
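The three indicators follow the standard object-level definitions; a short sketch, with made-up detection counts, is:

```python
# Sketch: object-level precision, recall and F1-score from detection counts.
# The counts below are made up for illustration.

def prf(tp, fp, fn):
    """Compute precision, recall and F1 from true-positive,
    false-positive and false-negative object counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g. 18 correctly classified chairs, 2 spurious detections, 3 missed:
p, r, f = prf(18, 2, 3)  # precision 0.9, recall = 18/21 ≈ 0.857
```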

Performance comparison

To compare the performance of our proposed method with that of other state-of-the-art approaches, we examined two benchmark datasets (termed “Bench I” in Fig. 9 and “Bench II” in Fig. 10) that were also tested by other methods such as Nan et al., 2012, Mattausch et al., 2014, Wang et al., 2016 and by deep learning methods such as PointCNN (Li et al., 2018b) and VoteNet (Qi et al., 2019). The methods in Nan et al., 2012, Mattausch et al., 2014 were geometry-based, while the method in Wang

Conclusion

Current methods for the classification of 3D indoor objects from point clouds rely on attributes extracted along the upright orientation and show obvious defects in classifying objects with various poses unless training data are carefully acquired. To eliminate such deficiencies, this paper presented an anchor-based graph method capable of handling arbitrarily oriented objects and evaluated its performance on seven popular benchmark datasets. Comprehensive experiments

Funding

This study was funded by the National Natural Science Foundation of China (41871298, 42071366) and the National Key R&D Program of China (2017YFB0503701).

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to gratefully acknowledge Dr. Iro Armeni, Dr. Claudio Mura, Dr. Nan Liangliang, Dr. Axel Wendt, Dr. Rares Ambrus and Dr. Angela Dai for their help in accessing tested data.

References (62)

  • I. Alhashim. Deformation-driven topology-varying 3D shape correspondence. ACM Trans. Graphics (2015)
  • R. Ambrus et al. Automatic Room Segmentation from Unstructured 3-D Data of Indoor Environments. IEEE Rob. Autom. Lett. (2017)
  • I. Armeni. 3D Semantic Parsing of Large-Scale Indoor Spaces. 2016 IEEE Conference on Computer Vision and Pattern Recognition (2016)
  • Armeni, I., Sax, S., Zamir, A.R., Savarese, S., 2017. Joint 2D-3D-Semantic Data for Indoor Scene Understanding....
  • T. Breuer. Johnny: An autonomous service robot for domestic environments. J. Intell. Rob. Syst. (2011)
  • Choy, C., Gwak, J., Savarese, S., 2019. 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks. 2019...
  • A. Dai. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
  • H. Fu et al. Upright orientation of man-made objects. ACM Trans. Graphics (2008)
  • C. Gomez et al. Object-Based Pose Graph for Dynamic Indoor Environments. IEEE Rob. Autom. Lett. (2020)
  • Y. Guo. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. (2020)
  • S. Gupta et al. Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation. Int. J. Comput. Vision (2015)
  • Ikehata, S., Yang, H., Furukawa, Y., 2015. Structured Indoor Modeling. Proceedings of the IEEE International Conference...
  • H. Isack et al. Energy-Based Geometric Multi-model Fitting. Int. J. Comput. Vision (2012)
  • Jiang, L., Zhao, H., Liu, S., Shen, X., Fu, C.W., Jia, J., 2019. Hierarchical point-edge interaction network for point...
  • Z. Kang et al. A Review of Techniques for 3D Reconstruction of Indoor Environments. ISPRS Int. J. Geo-Inf. (2020)
  • A. Kasper et al. Using Spatial Relations of Objects in Real World Scenes for Scene Structuring and Scene Understanding
  • H. Laga et al. Geometry and context for semantic correspondences and functionality recognition in man-made 3D shapes. ACM Trans. Graphics (2013)
  • K. Lai et al. Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation. Int. J. Robot. Res. (2010)
  • B. Li et al. A UWB-Based Indoor Positioning System Employing Neural Networks. J. Geovisualiz. Spat. Anal. (2020)
  • L. Li. Reconstruction of Three-Dimensional (3D) Indoor Interiors with Multiple Stories via Comprehensive Segmentation. Remote Sens. (2018)
  • Li, Y., et al., 2018. PointCNN: Convolution On X-Transformed Points. arXiv:...