Double Similarity Distillation for Semantic Image Segmentation, IEEE Transactions on Image Processing


Double Similarity Distillation for Semantic Image Segmentation
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2021-05-28, DOI: 10.1109/tip.2021.3083113
Yingchao Feng, Xian Sun, Wenhui Diao, Jihao Li, Xin Gao

Balancing high accuracy against high speed has always been a challenge in semantic image segmentation. Compact segmentation networks are more widely used when resources are limited, but their performance is constrained. In this paper, motivated by residual learning and global aggregation, we propose a simple yet general and effective knowledge distillation framework called double similarity distillation (DSD), which improves the classification accuracy of all existing compact networks by capturing similarity knowledge in the pixel and category dimensions, respectively. Specifically, we propose a pixel-wise similarity distillation (PSD) module that uses residual attention maps to capture more detailed spatial dependencies across multiple layers. Compared with existing methods, the PSD module greatly reduces the amount of computation and is easy to extend. Furthermore, considering how the characteristics of the semantic segmentation task differ from those of other computer vision tasks, we propose a category-wise similarity distillation (CSD) module, which helps the compact segmentation network strengthen global category correlation by constructing a correlation matrix. Combining these two modules, the DSD framework adds no extra parameters and only a minimal increase in FLOPs. Extensive experiments on four challenging datasets (Cityscapes, CamVid, ADE20K, and Pascal VOC 2012) show that DSD outperforms current state-of-the-art methods, demonstrating its effectiveness and generality. The code and models will be publicly available.
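
The abstract does not give formulas for the two modules, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes that an attention map is the channel-wise sum of squared activations, that a "residual" attention map is the difference between the attention maps of two consecutive layers of equal spatial size, and that the category correlation matrix is the cosine similarity between per-class score maps. All function names (`attention_map`, `residual_attention`, `psd_loss`, `category_correlation`, `csd_loss`) are hypothetical.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero when normalising


def attention_map(feat):
    """Collapse a (C, H, W) feature map into an L2-normalised vector of length H*W.

    Assumption: attention = sum of squared activations over channels.
    """
    att = (feat ** 2).sum(axis=0).reshape(-1)
    return att / (np.linalg.norm(att) + EPS)


def residual_attention(feat_a, feat_b):
    """Assumed form of a residual attention map: the difference between the
    attention maps of two layers that share the same spatial size."""
    return attention_map(feat_b) - attention_map(feat_a)


def psd_loss(student_feats, teacher_feats):
    """Mean squared gap between student and teacher residual attention maps,
    averaged over consecutive layer pairs. Channel counts may differ between
    the two networks because attention maps collapse the channel axis."""
    losses = []
    for i in range(len(student_feats) - 1):
        r_s = residual_attention(student_feats[i], student_feats[i + 1])
        r_t = residual_attention(teacher_feats[i], teacher_feats[i + 1])
        losses.append(np.mean((r_s - r_t) ** 2))
    return float(np.mean(losses))


def category_correlation(logits):
    """Map (K, H, W) class score maps to a (K, K) cosine-similarity matrix."""
    k = logits.shape[0]
    flat = logits.reshape(k, -1)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + EPS)
    return flat @ flat.T


def csd_loss(student_logits, teacher_logits):
    """Mean squared gap between student and teacher category correlation matrices."""
    diff = category_correlation(student_logits) - category_correlation(teacher_logits)
    return float(np.mean(diff ** 2))


# Illustrative usage with random tensors standing in for real network outputs.
rng = np.random.default_rng(0)
teacher_feats = [rng.normal(size=(64, 8, 8)) for _ in range(3)]
student_feats = [rng.normal(size=(16, 8, 8)) for _ in range(3)]
teacher_logits = rng.normal(size=(19, 8, 8))
student_logits = rng.normal(size=(19, 8, 8))
total = psd_loss(student_feats, teacher_feats) + csd_loss(student_logits, teacher_logits)
```

In this sketch both losses compare parameter-free statistics of the two networks, which is consistent with the abstract's statement that DSD adds no extra parameters; the equal weighting of the two terms in `total` is an arbitrary choice for illustration.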


Updated: 2021-06-04
