Multi-Stage Network With Geometric Semantic Attention for Two-View Correspondence Learning
IEEE Transactions on Image Processing ( IF 10.6 ) Pub Date : 2024-04-24 , DOI: 10.1109/tip.2024.3391002
Shuyuan Lin 1 , Xiao Chen 1 , Guobao Xiao 2 , Hanzi Wang 3 , Feiran Huang 1 , Jian Weng 1

The removal of outliers is crucial for establishing correspondence between two images. However, when the proportion of outliers reaches nearly 90%, the task becomes highly challenging. Existing methods face limitations in effectively utilizing geometric transformation consistency (GTC) information and incorporating geometric semantic neighboring information. To address these challenges, we propose a Multi-Stage Geometric Semantic Attention (MSGSA) network. The MSGSA network consists of three key modules: the multi-branch (MB) module, the GTC module, and the geometric semantic attention (GSA) module. The MB module, structured with a multi-branch design, facilitates diverse and robust spatial transformations. The GTC module captures transformation consistency information from the preceding stage. The GSA module categorizes the input based on the prior stage’s output, enabling efficient extraction of geometric semantic information through a graph-based representation and inter-category information interaction using a Transformer. Extensive experiments on the YFCC100M and SUN3D datasets demonstrate that MSGSA outperforms current state-of-the-art methods in outlier removal and camera pose estimation, particularly in scenarios with a high prevalence of outliers. Source code is available at https://github.com/shuyuanlin.

Updated: 2024-04-24