Mask DnGAN: Multi-Stage Raw Video Denoising with Adversarial Loss and Gradient Mask


arXiv - CS - Computer Vision and Pattern Recognition, Pub Date: 2021-03-04, DOI: arxiv-2103.02861
Avinash Paliwal, Libing Zeng, Nima Khademi Kalantari

In this paper, we propose a learning-based approach for denoising raw videos captured under low lighting conditions. We propose to do this by first explicitly aligning the neighboring frames to the current frame using a convolutional neural network (CNN). We then fuse the registered frames using another CNN to obtain the final denoised frame. To avoid directly aligning the temporally distant frames, we perform the two processes of alignment and fusion in multiple stages. Specifically, at each stage, we perform the denoising process on three consecutive input frames to generate the intermediate denoised frames which are then passed as the input to the next stage. By performing the process in multiple stages, we can effectively utilize the information of neighboring frames without directly aligning the temporally distant frames. We train our multi-stage system using an adversarial loss with a conditional discriminator. Specifically, we condition the discriminator on a soft gradient mask to prevent introducing high-frequency artifacts in smooth regions. We show that our system is able to produce temporally coherent videos with realistic details. Furthermore, we demonstrate through extensive experiments that our approach outperforms state-of-the-art image and video denoising methods both numerically and visually.
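
The multi-stage idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: `align` and `fuse` are hypothetical placeholders for the two CNNs (an identity mapping and a plain average), frames are 2D float arrays rather than real raw sensor data, and `soft_gradient_mask` is only one plausible way to build a soft mask from gradient magnitude. The sketch shows that chaining two three-frame stages lets each output frame draw on five input frames, while no stage aligns frames more than one step apart.

```python
import numpy as np

def align(neighbor, reference):
    """Hypothetical stand-in for the alignment CNN: returns the neighbor as-is."""
    return neighbor

def fuse(registered):
    """Hypothetical stand-in for the fusion CNN: averages the registered frames."""
    return np.mean(registered, axis=0)

def denoise_stage(frames):
    """One stage: rebuild every frame from itself and its two direct neighbors."""
    n = len(frames)
    out = []
    for t in range(n):
        prev_f = frames[max(t - 1, 0)]      # clamp indices at the sequence ends
        next_f = frames[min(t + 1, n - 1)]
        registered = np.stack([
            align(prev_f, frames[t]),
            frames[t],
            align(next_f, frames[t]),
        ])
        out.append(fuse(registered))
    return out

def multi_stage_denoise(frames, num_stages=2):
    """Feed each stage's intermediate frames into the next stage."""
    for _ in range(num_stages):
        frames = denoise_stage(frames)
    return frames

def soft_gradient_mask(image, scale=10.0):
    """Assumed soft mask: 0 in flat regions, approaching 1 near strong edges."""
    gy, gx = np.gradient(image)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    return 1.0 - np.exp(-scale * magnitude)

# Impulse test: only frame 3 is non-zero, so after two stages its influence
# should reach frames 1 through 5 and no further.
frames = [np.zeros((4, 4)) for _ in range(7)]
frames[3] = np.ones((4, 4))
result = multi_stage_denoise(frames, num_stages=2)
reached = [t for t, f in enumerate(result) if f.max() > 0]
print(reached)
```

A mask of this kind could be supplied to a conditional discriminator alongside the image, so that the discriminator can tell smooth regions from textured ones; how the paper actually computes and uses its mask is not stated in the abstract.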

Updated: 2021-03-05
