ORDNet: Capturing Omni-Range Dependencies for Scene Parsing

IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2020-08-05, DOI: 10.1109/tip.2020.3013142
Shaofei Huang, Si Liu, Tianrui Hui, Jizhong Han, Bo Li, Jiashi Feng, Shuicheng Yan

Learning to capture dependencies between spatial positions is essential to many visual tasks, especially dense labeling problems such as scene parsing. Existing methods capture long-range dependencies effectively with self-attention mechanisms and short-range dependencies with local convolutions. However, a large gap remains between long-range and short-range dependencies, which limits how flexibly these models handle the diverse spatial scales and relationships found in complicated natural scene images. To fill this gap, we develop a Middle-Range (MR) branch that captures middle-range dependencies by restricting self-attention to local patches. We also observe that emphasizing the spatial regions that correlate strongly with others allows long-range dependencies to be exploited more accurately, and we therefore propose a Reweighed Long-Range (RLR) branch. Based on the proposed MR and RLR branches, we build an Omni-Range Dependencies Network (ORDNet) that effectively captures short-, middle- and long-range dependencies. ORDNet extracts more comprehensive context information and adapts well to the complex spatial variance in scene images. Extensive experiments show that ORDNet outperforms previous state-of-the-art methods on three scene parsing benchmarks, PASCAL Context, COCO Stuff and ADE20K, demonstrating the benefit of capturing omni-range dependencies in deep models for the scene parsing task.
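
To make the idea of restricting self-attention to local patches concrete, here is a minimal sketch in plain Python. It is an illustration only, not the authors' implementation: the function names (`attend`, `patch_attention`), the non-overlapping square windows, and the use of raw features as queries, keys and values (no learned projections) are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption: queries, keys and values are the raw features themselves;
    a real network would apply learned projections first.
    """
    scale = math.sqrt(len(features[0]))
    outputs = []
    for query in features:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(scores)
        outputs.append([sum(w * value[ch] for w, value in zip(weights, features))
                        for ch in range(len(query))])
    return outputs


def patch_attention(fmap, patch):
    """Apply self-attention independently inside each patch x patch window.

    fmap is an H x W grid (list of rows) of feature vectors. The window
    layout (non-overlapping squares) is a hypothetical choice for this sketch.
    """
    height, width = len(fmap), len(fmap[0])
    if height % patch or width % patch:
        raise ValueError("feature map size must be divisible by patch size")
    out = [[None] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            coords = [(r, c)
                      for r in range(top, top + patch)
                      for c in range(left, left + patch)]
            attended = attend([fmap[r][c] for r, c in coords])
            for (r, c), vec in zip(coords, attended):
                out[r][c] = vec
    return out
```

In this sketch the patch size controls the range of the dependencies: with `patch` equal to the full map size it reduces to ordinary global self-attention, with `patch=1` every position attends only to itself, and intermediate sizes confine each position's context to its own window.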


Updated: 2020-08-14
