FCSS: Fully Convolutional Self-Similarity for Dense Semantic Correspondence
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6). Pub Date: 2018-02-07. DOI: 10.1109/tpami.2018.2803169
Seungryong Kim, Dongbo Min, Bumsub Ham, Stephen Lin, Kwanghoon Sohn

We present a descriptor, called fully convolutional self-similarity (FCSS), for dense semantic correspondence. Unlike traditional dense correspondence approaches for estimating depth or optical flow, semantic correspondence estimation poses additional challenges due to intra-class appearance and shape variations among different instances within the same object or scene category. To robustly match points across semantically similar images, we formulate FCSS using local self-similarity (LSS), which is inherently insensitive to intra-class appearance variations. LSS is incorporated through a proposed convolutional self-similarity (CSS) layer, where the sampling patterns and the self-similarity measure are jointly learned in an end-to-end and multi-scale manner. Furthermore, to address shape variations among different object instances, we propose a convolutional affine transformer (CAT) layer that estimates explicit affine transformation fields at each pixel to transform the sampling patterns and corresponding receptive fields. As training data for semantic correspondence is rather limited, we propose to leverage object candidate priors provided in most existing datasets and also correspondence consistency between object pairs to enable weakly-supervised learning. Experiments demonstrate that FCSS significantly outperforms conventional handcrafted descriptors and CNN-based descriptors on various benchmarks.
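The core building block of FCSS is local self-similarity (LSS): instead of describing a point by its raw appearance, it describes how a small patch around the point correlates with other patches in its neighborhood, which is what makes the descriptor insensitive to intra-class appearance changes. The sketch below illustrates the classic (handcrafted) LSS idea with a fixed dense sampling grid and an exponentially mapped patch distance; it is not the paper's learned CSS layer, whose sampling patterns and similarity measure are trained end-to-end, and the patch/region sizes and bandwidth are illustrative choices.

```python
import numpy as np

def local_self_similarity(img, cx, cy, patch=3, region=9):
    """Illustrative local self-similarity (LSS) descriptor at pixel (cx, cy).

    Compares a small center patch against every same-sized patch whose
    center lies in a surrounding region, and maps each sum-of-squared-
    differences distance to a similarity in (0, 1]. Simplified sketch of
    the LSS concept, not the learned convolutional FCSS layer.
    """
    r_p, r_r = patch // 2, region // 2
    center = img[cy - r_p:cy + r_p + 1, cx - r_p:cx + r_p + 1].astype(float)
    sims = []
    # Offsets are restricted so every candidate patch fits inside the region.
    for dy in range(-r_r + r_p, r_r - r_p + 1):
        for dx in range(-r_r + r_p, r_r - r_p + 1):
            y, x = cy + dy, cx + dx
            cand = img[y - r_p:y + r_p + 1, x - r_p:x + r_p + 1].astype(float)
            ssd = np.sum((center - cand) ** 2)
            # Distance -> similarity; bandwidth chosen arbitrarily here.
            sims.append(np.exp(-ssd / (patch * patch * 255.0)))
    return np.asarray(sims)

# Usage: descriptor at the center of a random test image. The entry for the
# zero offset compares the patch with itself, so it is always exactly 1.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32))
desc = local_self_similarity(img, 16, 16)
```

In the paper's CSS layer, both the sampling pattern (here a fixed dense grid) and the similarity measure (here a fixed Gaussian of SSD) are replaced by learnable, multi-scale components, and the CAT layer additionally warps the sampling pattern with a per-pixel affine field.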

Updated: 2019-02-06