Attention Aggregation Encoder-decoder Network Framework for Stereo Matching
IEEE Signal Processing Letters (IF 3.9), Pub Date: 2020-01-01, DOI: 10.1109/lsp.2020.2993776
Yaru Zhang, Yaqian Li, Yating Kong, Bin Liu

In deep-learning-based stereo matching networks, current cost aggregation networks lack the means to aggregate the cost volume fully. Therefore, departing from standard encoder-decoder structures, we propose an attention aggregation encoder-decoder network framework for stereo matching that contains three modules. Specifically, we design a sub-branch and cross-stage aggregation encoding module, which aggregates context information across different sub-branches and stages so that the different deep cost volumes can make use of one another. Meanwhile, we introduce a three-dimensional attention recoding module that recalibrates the high-level semantic information of the sub-branches to obtain a robust, discriminative cost volume. In addition, we construct a stepwise aggregation decoding module that decodes the cost volume via a stepwise fusion upsampling strategy, which further enhances the learning ability of the network model. Experimental results on the Scene Flow and KITTI benchmark datasets show that the proposed network framework is superior to other similar methods in aggregating information.
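
The abstract does not say how the three-dimensional attention recoding module recalibrates the cost volume. As a rough illustration of what "recalibrating" a cost volume can mean, the sketch below applies a generic squeeze-and-excitation-style channel attention to a volume of shape (C, D, H, W). This is an assumed stand-in, not the authors' module: the function name channel_attention_3d, the bottleneck weights w1 and w2, and the tensor sizes are all hypothetical, and the weights are random rather than learned.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_3d(cost_volume, w1, w2):
    """Rescale each channel of a (C, D, H, W) cost volume by a weight in (0, 1).

    w1 has shape (C_reduced, C) and w2 has shape (C, C_reduced). Both are
    hypothetical stand-ins for weights a real network would learn.
    """
    # Squeeze: average over disparity and spatial axes, one value per channel.
    squeezed = cost_volume.mean(axis=(1, 2, 3))
    # Excite: small bottleneck (ReLU, then sigmoid) producing per-channel weights.
    hidden = np.maximum(w1 @ squeezed, 0.0)
    weights = sigmoid(w2 @ hidden)
    # Recalibrate: broadcast the per-channel weights over D, H, W.
    return cost_volume * weights[:, None, None, None], weights

# Toy sizes chosen only for illustration.
rng = np.random.default_rng(0)
C, D, H, W = 8, 4, 6, 6
volume = rng.standard_normal((C, D, H, W))
w1 = rng.standard_normal((C // 2, C))
w2 = rng.standard_normal((C, C // 2))

recalibrated, weights = channel_attention_3d(volume, w1, w2)
print(recalibrated.shape, weights.shape)
```

The output keeps the shape of the input volume; only the relative scale of the channels changes, so a block of this kind can sit between an encoder and a decoder without altering the surrounding layers.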

Updated: 2020-01-01