Matching wide-baseline stereo images with weak texture using the perspective invariant local feature transformer
Journal of Applied Remote Sensing (IF 1.4), Pub Date: 2022-07-01, DOI: 10.1117/1.jrs.16.036502
Guobiao Yao, Pengfei Huang, Haibin Ai, Chuanhui Zhang, Jin Zhang, Chengcheng Zhang, Fuyao Wang

Advances in remote sensing sensor techniques now allow us to readily capture many types of indoor and outdoor scene images, which often contain weak texture regions with notable geometric distortions. Obtaining qualified matches from such difficult stereo images with existing methods is challenging. Recent deep-learning results have shown that convolutional neural networks (CNNs) are adept at the image matching task. However, in practical applications, two challenges remain: first, it is difficult to detect features in the weak texture regions of an image, and existing CNNs fail to extract discriminative information from the quantized features of weak texture; second, because of the complex distortion across wide-baseline stereo images, it is difficult to match the feature primitives detected in the image pair. To solve these problems, we propose the perspective invariant local feature transformer (PILFT) algorithm. Our method comprises four main steps. (1) The affine scale-invariant feature transform is applied to automatically extract corresponding features from the images, and the perspective of the matched image is then corrected to remove as much geometric deformation as possible. (2) A residual network extracts latent features from the stereo images, yielding coarse and fine feature maps at different scales. (3) Using an attention mechanism, location and context information are added to the coarse-level features, and coarse matches are predicted by a dual-softmax layer. (4) The features are precisely localized on the fine feature map using the coarse matches as reference, and the final matching results are determined by the computed matching probability. Extensive experiments on wide-baseline weak-texture images demonstrate that the proposed method outperforms existing algorithms in the number of matches, correct-match rate, and matching accuracy.
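Step 1 hinges on estimating a projective transform from seed correspondences so the matched image can be rectified. A minimal numpy sketch of the standard direct linear transform (DLT) for homography estimation is given below; it is an illustration of the underlying geometry, not the authors' exact implementation, and the function name is our own.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate a 3x3 homography H mapping src -> dst via the DLT.

    src, dst: lists of >= 4 corresponding (x, y) points. Once seed
    correspondences (e.g., from ASIFT) give H, the target image can be
    resampled with H to remove most of the perspective deformation.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on h.
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right-singular vector of A with smallest
    # singular value (the null space of A for exact correspondences).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2, 2] == 1
```

For identical point sets the estimate reduces to the identity transform; in practice the estimate would be wrapped in a robust loop such as RANSAC to reject outlier seed matches.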
The pseudocode of PILFT is available at https://github.com/KiltAB/PILFT.
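The dual-softmax layer in step 3 converts a descriptor similarity matrix into mutual matching probabilities. The numpy sketch below illustrates the general dual-softmax matching scheme used in transformer-based matchers; the temperature and confidence threshold values are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def dual_softmax_match(feat_a, feat_b, temperature=0.1, threshold=0.2):
    """Coarse matching by dual-softmax over a similarity matrix.

    feat_a: (N, D) coarse descriptors from image A
    feat_b: (M, D) coarse descriptors from image B
    Returns (matches, prob): mutual-nearest-neighbour index pairs whose
    matching probability exceeds `threshold`, and the (N, M) probability map.
    """
    sim = feat_a @ feat_b.T / temperature  # scaled similarity, (N, M)

    def softmax(x, axis):
        x = x - x.max(axis=axis, keepdims=True)  # numerical stability
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    # Row-wise and column-wise softmax multiplied elementwise:
    # the "dual-softmax" matching probability.
    prob = softmax(sim, axis=1) * softmax(sim, axis=0)

    # Keep mutual nearest neighbours above the confidence threshold.
    rows = prob.argmax(axis=1)
    cols = prob.argmax(axis=0)
    matches = [(i, int(j)) for i, j in enumerate(rows)
               if cols[j] == i and prob[i, j] > threshold]
    return matches, prob
```

In a full pipeline these coarse matches would then seed the fine-level refinement of step 4, where each match is re-localized on the fine feature map.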
