Designing CNNs for Multimodal Image Restoration and Fusion via Unfolding the Method of Multipliers
IEEE Transactions on Circuits and Systems for Video Technology ( IF 8.3 ) Pub Date : 2022-03-30 , DOI: 10.1109/tcsvt.2022.3163649
Iman Marivani, Evaggelia Tsiligianni, Bruno Cornelis, Nikos Deligiannis
Multimodal, or guided, image restoration is the reconstruction of a degraded image from a target modality with the aid of a high-quality image from another modality. A similar task is image fusion, which refers to merging images from different modalities into a composite image. Traditional approaches for multimodal image restoration and fusion include analytical methods that are computationally expensive at inference time. Recently developed deep learning methods have shown great performance at a reduced computational cost; however, since these methods do not incorporate prior knowledge about the problem at hand, they result in a "black box" model, that is, one can hardly say what the model has learned. In this paper, we formulate multimodal image restoration and fusion as a coupled convolutional sparse coding problem, and adopt the Method of Multipliers (MM) for its solution. Then, we use the MM-based solution to design a convolutional neural network (CNN) encoder that follows the principle of deep unfolding. To address multimodal image restoration and fusion, we design two multimodal models which employ the proposed encoder followed by an appropriately designed decoder that maps the learned representations to the desired output. Unlike most existing deep learning designs comprising multiple encoding branches followed by a concatenation or a linear-combination fusion block, the proposed design provides an efficient and structured way to fuse information at different stages of the network, yielding representations that can lead to accurate image reconstruction. The proposed models are applied to three image restoration tasks as well as two image fusion tasks. Quantitative and qualitative comparisons against various state-of-the-art analytical and deep learning methods corroborate the superior performance of the proposed framework.
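To make the idea of unfolding the Method of Multipliers concrete, the sketch below runs a fixed number of MM (augmented-Lagrangian, ADMM-form) iterations for a standard sparse coding problem, min_x 0.5||y − Dx||² + λ||x||₁. This is a simplified, single-modality, dense-dictionary illustration of the unfolding principle, not the paper's coupled convolutional model: the dictionary `D`, the penalty `lam`, and the parameter `rho` are assumptions for the sketch. In a deep-unfolded network, each loop iteration becomes one layer and the matrices and thresholds become learnable weights.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unfolded_mm_sparse_coding(y, D, lam=0.05, rho=1.0, K=50):
    """K unrolled Method-of-Multipliers iterations (ADMM splitting x = z) for
    min_x 0.5*||y - D x||^2 + lam*||x||_1.
    In a deep-unfolding design, K is fixed and each iteration is a network
    layer whose parameters (here A, lam, rho) would be learned from data."""
    m, n = D.shape
    # Precompute the quadratic-update operator; in an unfolded network this
    # inverse would be replaced by a trainable linear layer.
    A = np.linalg.inv(D.T @ D + rho * np.eye(n))
    x = np.zeros(n)
    z = np.zeros(n)  # auxiliary (sparse) variable
    u = np.zeros(n)  # scaled Lagrange multipliers
    for _ in range(K):                        # K layers of the unrolled net
        x = A @ (D.T @ y + rho * (z - u))     # data-fidelity update
        z = soft_threshold(x + u, lam / rho)  # sparsifying nonlinearity
        u = u + x - z                         # multiplier (dual) update
    return z
```

A short usage check: coding a synthetic signal `y = D @ x_true` (with a random normalized dictionary and a 3-sparse `x_true`) and verifying that the residual shrinks below the initial one illustrates that the fixed-depth unrolled iterations still behave like the underlying optimizer.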

Updated: 2022-03-30