2D–3D Geometric Fusion network using Multi-Neighbourhood Graph Convolution for RGB-D indoor scene classification
Information Fusion (IF 14.7), Pub Date: 2021-05-14, DOI: 10.1016/j.inffus.2021.05.002
Albert Mosella-Montoro , Javier Ruiz-Hidalgo

Multi-modal fusion has been proven to enhance the performance of scene classification tasks. This paper presents a 2D–3D Fusion stage that combines 3D Geometric Features with 2D Texture Features obtained by 2D Convolutional Neural Networks. To obtain a robust 3D Geometric embedding, a network that uses two novel layers is proposed. The first layer, Multi-Neighbourhood Graph Convolution, aims to learn a more robust geometric descriptor of the scene by combining two different neighbourhoods: one in the Euclidean space and the other in the Feature space. The second proposed layer, Nearest Voxel Pooling, improves the performance of the well-known Voxel Pooling. Experimental results on the NYU-Depth-V2 and SUN RGB-D datasets show that the proposed method outperforms the current state of the art on the RGB-D indoor scene classification task.
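The core idea of the Multi-Neighbourhood Graph Convolution — aggregating features over two kNN graphs, one built on 3D coordinates and one on the feature embedding — can be sketched as follows. This is a hypothetical, simplified NumPy illustration (mean aggregation, dense distance matrix), not the authors' implementation; the function and parameter names are invented for this sketch.

```python
import numpy as np

def knn_indices(points, k):
    # Indices of the k nearest neighbours of each point (self included),
    # using a dense pairwise squared-distance matrix -- fine for toy sizes.
    d = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d, axis=1)[:, :k]

def multi_neighbourhood_graph_conv(xyz, feats, w_euc, w_feat, k=4):
    """Toy multi-neighbourhood graph convolution (illustrative only).

    Builds one kNN graph in Euclidean (xyz) space and one in feature
    space, mean-aggregates the point features over each neighbourhood,
    applies a separate linear map to each aggregate, and sums them.
    """
    idx_euc = knn_indices(xyz, k)     # neighbourhood in Euclidean space
    idx_feat = knn_indices(feats, k)  # neighbourhood in feature space
    agg_euc = feats[idx_euc].mean(axis=1)    # (N, C_in)
    agg_feat = feats[idx_feat].mean(axis=1)  # (N, C_in)
    return agg_euc @ w_euc + agg_feat @ w_feat  # (N, C_out)
```

The key design point the abstract highlights is that the two neighbourhoods generally differ: points close in 3D space need not be close in feature space, so the two aggregates carry complementary context.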




Updated: 2021-05-19