Temporally Coherent Video Harmonization Using Adversarial Networks


IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2019-07-17, DOI: 10.1109/tip.2019.2925550
Hao-Zhi Huang, Sen-Zhe Xu, Jun-Xiong Cai, Wei Liu, Shi-Min Hu

Compositing is one of the most important editing operations for images and videos. The process of improving the realism of composite results is often called harmonization. Previous approaches to harmonization focus mainly on images. In this paper, we go one step further and address the problem of video harmonization. Specifically, we train a convolutional neural network adversarially, exploiting a pixel-wise disharmony discriminator to achieve more realistic harmonized results and introducing a temporal loss to increase temporal consistency between consecutive harmonized frames. Thanks to the pixel-wise disharmony discriminator, we are also able to alleviate the need for input foreground masks. Since existing video datasets that have ground-truth foreground masks and optical flows are not sufficiently large, we propose a simple yet efficient method to build a synthetic dataset that supports supervised training of the proposed adversarial network. The experiments show that training on our synthetic dataset generalizes well to the real-world composite dataset. In addition, our method successfully incorporates temporal consistency during training and achieves more harmonious visual results than previous methods.
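
The abstract does not spell out the form of the temporal loss, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes the loss compares the current harmonized frame with the previous harmonized frame warped by optical flow, and averages the squared difference over pixels whose flow target lies inside the image. The function names `warp_nearest` and `temporal_loss`, the flow convention, and the nearest-neighbour sampling are all assumptions made for illustration.

```python
import numpy as np


def warp_nearest(frame, flow):
    """Backward-warp `frame` (H, W, C) using `flow` (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy) points from pixel (x, y)
    in the current frame to its source location in the previous frame.
    Nearest-neighbour sampling keeps the sketch short. Returns the
    warped frame and a boolean mask of pixels with an in-range source.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    # Clip only to keep indexing legal; clipped samples are masked out.
    warped = frame[np.clip(src_y, 0, h - 1), np.clip(src_x, 0, w - 1)]
    return warped, valid


def temporal_loss(prev_out, curr_out, flow):
    """Mean squared difference between the current harmonized frame and
    the flow-warped previous harmonized frame, over valid pixels only."""
    warped, valid = warp_nearest(prev_out, flow)
    diff = (curr_out - warped) ** 2
    mask = valid[..., None].astype(diff.dtype)
    denom = mask.sum() * diff.shape[-1]  # number of valid elements
    return float((diff * mask).sum() / max(denom, 1.0))
```

With zero flow and identical frames this loss is zero, and it grows as the harmonized appearance of the same scene point drifts between consecutive frames. In actual training, a differentiable (for example bilinear) warp in a deep learning framework would be needed so that gradients can reach the network.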


Updated: 2020-04-22
