JoKDNet: A joint keypoint detection and description network for large-scale outdoor TLS point clouds registration

doi:10.1016/j.jag.2021.102534

International Journal of Applied Earth Observation and Geoinformation

Volume 104, 15 December 2021, 102534

https://doi.org/10.1016/j.jag.2021.102534

Under a Creative Commons license

open access

Highlights

• Registration provides complete data for monitoring complex environments.
• JoKDNet is the first network to focus on large-scale outdoor TLS point cloud registration.
• Keypoint detection and description are jointly learned to register point clouds.

Abstract

Registration of large-scale outdoor Terrestrial Laser Scanning (TLS) point clouds still faces many challenges in scenes with symmetric and repetitive elements (e.g., park, forest, and tunnel), weak geometric features (e.g., underground excavation), and dramatic changes between different phases (e.g., mountain). To address these issues, a novel neural network, JoKDNet, is proposed to jointly learn keypoint detection and feature description, improving the feasibility and accuracy of point cloud registration. Firstly, a novel keypoint detection module is introduced to automatically learn a score for each sampled point and to regard the Top-k most significant sampled points as the detected keypoints. Secondly, an enhanced feature description module is proposed to learn the feature representation of each keypoint by fusing hierarchical local features and context features. Thirdly, a loss function is designed to make the detected keypoints more distinguishable for matching: it simultaneously maximizes the feature distance between non-corresponding keypoints and minimizes the feature distance between corresponding keypoints. Finally, a distance matrix module and RANdom SAmple Consensus (RANSAC) are used to determine the correspondences between the source and target point clouds for the transformation calculation. Comprehensive experiments show that JoKDNet performs effectively on five challenging scenes (park, forest, tunnel, underground excavation, and mountain) from two datasets (WHU-TLS and ETH-TLS) in terms of registration error and robustness to varying scenes, with a maximum rotation error of less than 0.06° and a maximum translation error of less than 0.84 m without ICP.
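
As a rough illustration of two steps named in the abstract, the sketch below selects the Top-k sampled points by score and then derives candidate correspondences from a feature distance matrix. It is not the authors' implementation: in JoKDNet the scores and descriptors are learned by the network, whereas here they are plain lists supplied by the caller. The function names (`top_k_keypoints`, `distance_matrix`, `mutual_nearest_matches`) are hypothetical, the Euclidean metric is an assumption, and mutual nearest-neighbour filtering is an assumed stand-in for the paper's distance matrix module. The RANSAC stage is omitted.

```python
import math


def top_k_keypoints(scores, k):
    """Return the indices of the k highest-scoring sampled points."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def distance_matrix(src_feats, tgt_feats):
    """Pairwise Euclidean distances between source and target descriptors."""
    return [[math.dist(f, g) for g in tgt_feats] for f in src_feats]


def mutual_nearest_matches(dist):
    """Keep pairs (i, j) where j is i's nearest target and i is j's nearest source.

    Assumed filtering rule for illustration only; the paper's module may differ.
    """
    if not dist or not dist[0]:
        return []
    n_src, n_tgt = len(dist), len(dist[0])
    # Nearest target index for every source keypoint (row-wise minimum).
    row_best = [min(range(n_tgt), key=lambda j: dist[i][j]) for i in range(n_src)]
    # Nearest source index for every target keypoint (column-wise minimum).
    col_best = [min(range(n_src), key=lambda i: dist[i][j]) for j in range(n_tgt)]
    return [(i, j) for i, j in enumerate(row_best) if col_best[j] == i]


# Toy usage with made-up scores and 2-D descriptors.
scores = [0.1, 0.9, 0.4, 0.7]
keypoint_ids = top_k_keypoints(scores, 2)

src_feats = [(0.0, 0.0), (1.0, 1.0)]
tgt_feats = [(1.1, 0.9), (0.1, -0.1), (5.0, 5.0)]
matches = mutual_nearest_matches(distance_matrix(src_feats, tgt_feats))
```

In a full pipeline the resulting pairs would still contain outliers, which is where a robust estimator such as RANSAC comes in before the rigid transformation is computed.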

Keywords

Point clouds

Registration

3D deep learning

Keypoint detection

Feature descriptor