Multi-level graph convolutional recurrent neural network for semantic image segmentation
Telecommunication Systems (IF 2.5), Pub Date: 2021-03-25, DOI: 10.1007/s11235-021-00769-y
Dingchao Jiang, Hua Qu, Jihong Zhao, Jianlong Zhao, Wei Liang

With the advent of the Internet of Things (IoT) era, many devices have emerged that capture and generate various kinds of visual data. Recognizing and extracting meaningful patterns from these data requires powerful methods suited to different IoT applications. Deep convolutional neural networks (CNNs) have significantly improved performance on almost all computer vision tasks, including semantic image segmentation. However, feature extraction in CNNs may cause the loss of contextual and spatial information. Moreover, the standard convolutional and pooling layers adopted by most CNN architectures lead to a fixed receptive field, which makes it difficult to handle multi-scale objects in an image. To remedy these issues for semantic image segmentation, this paper proposes a multi-level graph convolutional recurrent neural network (MGCRNN) that combines CNNs and graph neural networks (GNNs) to fuse multi-level features. By applying a graph convolutional recurrent neural network (GCRNN), the proposed model acquires a global view of the image and aggregates multi-level contextual and structural information. The experiments verify the ability of GCRNN to obtain a flexible receptive field and learn structural features without losing spatial information. Results on the Pascal VOC 2012 and Cityscapes datasets show that the proposed model outperforms baseline approaches and is competitive with state-of-the-art methods.
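The abstract does not give the layer definitions, so the following is only a minimal sketch of a generic graph convolution step (the widely used normalized-adjacency form), not the authors' GCRNN. It illustrates how a node can aggregate features from whichever nodes it is connected to, rather than from a fixed grid window. The function names, the toy graph, and the idea of treating image regions as nodes are all assumptions made for illustration; the recurrent part of the model is not shown.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops so every node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_conv(adj, features, weight):
    """One generic graph convolution step: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weight)
    return [[max(0.0, v) for v in row] for row in projected]

# Hypothetical toy input: three nodes in a path graph 0 - 1 - 2,
# each with a 2-dimensional feature vector (e.g. pooled CNN features of a region).
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
node_features = [[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]]
identity_weight = [[1.0, 0.0],
                   [0.0, 1.0]]

updated = graph_conv(adjacency, node_features, identity_weight)
```

In this sketch the set of neighbors that contributes to each node is determined by the adjacency matrix, so changing the graph changes the effective receptive field without changing the layer itself.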




Updated: 2021-03-26