High-quality Frame Recurrent Video De-raining with Multi-contextual Adversarial Network
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.2). Pub Date: 2021-05-12. DOI: 10.1145/3444974
Prasen Kumar Sharma, Sujoy Ghosh, Arijit Sur

In this article, we address the problem of rain-streak removal in videos. Unlike image restoration, video restoration must enforce temporal consistency in addition to spatial enhancement. Researchers have proposed several effective methods that estimate de-noised videos with outstanding temporal consistency; however, such methods also incur a high computational cost due to their large model size. Our analysis suggests that incorporating separate modules for spatial and temporal enhancement demands more computational resources. This motivates us to propose a unified architecture that directly estimates the de-rained frame with maximal visual quality and minimal computational cost. To this end, we present a deep-learning-based Frame-recurrent Multi-contextual Adversarial Network for rain-streak removal in videos. The proposed model is built upon a Conditional Generative Adversarial Network (CGAN)-based framework, in which the generator directly estimates the de-rained frame from the previously estimated one with the help of its multi-contextual adversary. To optimize the proposed model, we incorporate a perceptual loss function in addition to the conventional Euclidean distance. Moreover, instead of the traditional entropy loss from the adversary, we propose to use the Euclidean distance between the features of the de-rained and clean frames, extracted from the discriminator, as a cost function for video de-raining. Extensive experiments across 11 test sets, against more than 10 state-of-the-art methods and using 14 image-quality metrics, demonstrate the efficacy of the proposed work, both visually and computationally.
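The abstract describes a composite training objective: a pixel-wise Euclidean term, a perceptual (feature-space) term, and a discriminator-feature distance term that replaces the usual adversarial entropy loss. The following is a minimal illustrative sketch of how such a combined objective could be assembled; the weights, the placeholder feature extractors, and the function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def l2(a, b):
    # Mean squared Euclidean distance between two arrays.
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))


def derain_loss(derained, clean, perc_feat, disc_feat,
                w_pix=1.0, w_perc=0.1, w_adv=0.01):
    """Combined de-raining objective sketched from the abstract.

    perc_feat: callable giving perceptual features (e.g., a pretrained
               network in practice; any feature map here).
    disc_feat: callable giving features extracted from the discriminator,
               used in place of the conventional entropy loss.
    The loss weights (w_pix, w_perc, w_adv) are hypothetical.
    """
    loss = w_pix * l2(derained, clean)                       # pixel-wise Euclidean
    loss += w_perc * l2(perc_feat(derained), perc_feat(clean))  # perceptual term
    loss += w_adv * l2(disc_feat(derained), disc_feat(clean))   # discriminator-feature term
    return loss
```

In a frame-recurrent setting, `derained` would be the generator's estimate conditioned on the previously estimated frame; the loss above is evaluated per frame and summed over the sequence.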
