An end-to-end deep context gate convolutional visual odometry system based on lightweight attention mechanism
Industrial Robot (IF 1.8), Pub Date: 2021-09-13, DOI: 10.1108/ir-01-2021-0019
Yan Xu, Hong Qin, Jiani Huang, Yanyun Wang

Purpose

Conventional learning-based visual odometry (VO) systems usually extract features with convolutional neural networks (CNNs), which can ignore important context-related, attention-holding global features. Without these global features, a VO system is sensitive to environmental perturbations. This paper designs a learning-based framework that aims to improve the accuracy of learning-based VO without reducing its generalization ability.

Design/methodology/approach

Instead of a standard CNN, a context-gated convolution is adopted to build an end-to-end learning framework. It enables the convolutional layers to dynamically capture representative local patterns and to compose local features of interest under the guidance of global context. In addition, an attention mechanism module is introduced to further improve learning ability and to enhance the robustness and generalization ability of the VO system.
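
The abstract does not give the layer equations, so the following is only a minimal sketch of the general idea of gating a convolution kernel with global context. It is not the authors' architecture. It assumes a 1-D, single-output-channel case, and the names `context_gated_conv1d`, `gate_weights` and `gate_bias` are hypothetical. Global context is taken as the per-channel mean of the input, projected to one gate per input channel, and the gates rescale the kernel before an ordinary convolution is applied.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def context_gated_conv1d(signal, kernel, gate_weights, gate_bias):
    """Simplified context-gated 1-D convolution (illustrative assumption).

    signal:       [C][T] input, C channels of length T
    kernel:       [C][K] weights for a single output channel
    gate_weights: [C][C] projection from global context to channel gates
    gate_bias:    [C]    bias added before the sigmoid
    Returns (output of length T - K + 1, list of C gates).
    """
    channels = len(signal)
    length = len(signal[0])
    ksize = len(kernel[0])

    # 1. Global context: average each channel over the whole sequence.
    context = [sum(channel) / length for channel in signal]

    # 2. One gate in (0, 1) per input channel, computed from the context.
    gates = [
        sigmoid(
            sum(gate_weights[c][j] * context[j] for j in range(channels))
            + gate_bias[c]
        )
        for c in range(channels)
    ]

    # 3. Modulate the kernel: each input channel's weights are rescaled.
    gated_kernel = [[w * gates[c] for w in kernel[c]] for c in range(channels)]

    # 4. Ordinary "valid" convolution (cross-correlation) with the gated kernel.
    output = []
    for t in range(length - ksize + 1):
        acc = 0.0
        for c in range(channels):
            for k in range(ksize):
                acc += gated_kernel[c][k] * signal[c][t + k]
        output.append(acc)
    return output, gates


if __name__ == "__main__":
    x = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
    w = [[1.0, 1.0], [1.0, 1.0]]
    zero_proj = [[0.0, 0.0], [0.0, 0.0]]
    y, g = context_gated_conv1d(x, w, zero_proj, [0.0, 0.0])
    print(g)  # [0.5, 0.5]
    print(y)  # [1.5, 2.5, 3.5]
```

With a zero projection and zero bias, every gate is 0.5 and the output is half of the plain convolution. In a trained network the projection would be learned, so the same kernel would respond differently depending on the global content of the input.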

Findings

The proposed system is evaluated on the public KITTI data set and on self-collected data sets of our college building, where it shows competitive performance compared with several classical and state-of-the-art learning-based methods. Quantitative results on KITTI show that, compared with CNN-based VO methods, the average translational and rotational errors over all test sequences are reduced by 45.63% and 37.22%, respectively.
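
The abstract does not say how the percentage reductions were aggregated across sequences or baselines. As a sketch, a relative error reduction of this kind is commonly computed as below. The function name `relative_reduction` and the numbers in the usage example are illustrative assumptions, not values from the paper.

```python
def relative_reduction(baseline_error, proposed_error):
    """Percentage by which proposed_error is lower than baseline_error.

    Both arguments are average errors in the same unit (for example, an
    average translational error in percent). This is an assumed formula,
    not one stated in the source.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - proposed_error) / baseline_error * 100.0


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(relative_reduction(8.0, 6.0))  # 25.0
```

A negative result would mean the proposed method has a larger error than the baseline.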

Originality/value

The main contribution of this paper is an end-to-end deep context gate convolutional VO system based on a lightweight attention mechanism, which effectively improves accuracy compared with other learning-based methods.



Updated: 2021-09-13