Scene Segmentation with DAG-Recurrent Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2017-06-06, DOI: 10.1109/tpami.2017.2712691
Bing Shuai, Zhen Zuo, Bing Wang, Gang Wang

In this paper, we address the challenging task of scene segmentation. To capture the rich contextual dependencies among image regions, we propose Directed Acyclic Graph-Recurrent Neural Networks (DAG-RNNs), which perform context aggregation over locally connected feature maps. More specifically, the DAG-RNN is placed on top of a pre-trained CNN (the feature extractor) to embed context into local features and thereby enhance their representational capability. Compared with a plain CNN (as in Fully Convolutional Networks, FCNs), the DAG-RNN is empirically found to be significantly more effective at aggregating context, and it accordingly shows a noticeable performance advantage over FCNs on scene segmentation. Moreover, the DAG-RNN requires dramatically fewer parameters and fewer computational operations, which makes it better suited to deployment on resource-constrained embedded devices. Meanwhile, class occurrence frequencies are extremely imbalanced in scene segmentation, so we propose a novel class-weighted loss to train the segmentation network. The loss assigns proportionally higher weights to infrequent classes during training, which is essential for improving their parsing performance. We evaluate our segmentation network on three challenging public scene segmentation benchmarks: SiftFlow, PASCAL Context, and COCO Stuff, and achieve very strong segmentation performance on all of them.
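The abstract names two ingredients: directional context aggregation over a locally connected feature lattice, and a loss whose class weights grow as class frequency shrinks. A minimal sketch of both follows. The scalar features, the fixed weight values `u` and `w`, and the log-damped inverse-frequency weighting are illustrative assumptions, not the paper's exact formulation (which uses vector hidden states, learned weight matrices, and four directional sweeps).

```python
import math
from collections import Counter

def dag_rnn_pass(x, u=0.5, w=0.3):
    """One directional (southeast) DAG-RNN sweep over a 2-D feature grid.

    Each cell's hidden state combines its own feature with the hidden
    states of its north and west predecessors, so context propagates
    across the whole lattice in a single pass. Scalar features and
    fixed scalar weights are simplifications for illustration.
    """
    rows, cols = len(x), len(x[0])
    h = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            ctx = (h[i - 1][j] if i > 0 else 0.0) + (h[i][j - 1] if j > 0 else 0.0)
            h[i][j] = math.tanh(u * x[i][j] + w * ctx)
    return h

def class_weights(labels, smooth=1.02):
    """Per-class weights that grow as pixel-label frequency shrinks.

    Log-damped inverse frequency is an illustrative heuristic, not
    necessarily the paper's exact weighting formula.
    """
    counts = Counter(labels)
    total = len(labels)
    return {c: 1.0 / math.log(smooth + counts[c] / total) for c in counts}

def weighted_cross_entropy(probs, targets, weights):
    """Mean class-weighted negative log-likelihood over pixels."""
    losses = [-weights[t] * math.log(p[t]) for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)
```

Running the sweep on a 2x2 grid whose only nonzero feature sits at the top-left corner still produces a nonzero hidden state at the bottom-right cell, showing that context reaches cells whose own features are empty; and for a 90/10 label split the rare class receives the larger loss weight.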

Updated: 2018-05-05