Extrapolation-Based Video Retargeting with Backward Warping Using an Image-to-Warping Vector Generation Network, IEEE Signal Processing Letters

IEEE Signal Processing Letters (IF 3.2), Pub Date: 2020-01-01, DOI: 10.1109/lsp.2020.2977206
Sung In Cho, Suk-Ju Kang

Video retargeting transforms a given video to a target aspect ratio. Current methods often cause severe visual distortion because of frequent temporal incoherence during retargeting. In this study, we propose a new extrapolation-based video retargeting method that uses an image-to-warping vector generation network. Instead of deforming the input frame, the method extends its side areas, which maintains temporal coherence and leaves the original content intact. Backward warping-based extrapolation is performed using a displacement vector (DV) generated by the proposed convolutional neural network (CNN). The DV is defined as the displacement between the current hole to be filled in the extended area and the pixel in the input frame that is used to fill that hole. We also propose a technique for training the CNN efficiently, including a method for generating ground-truth DVs. After the extrapolation, we apply a technique that maintains the temporal coherence of the extended region and a distortion suppression scheme (DSC) that minimizes visual artifacts. The simulation results demonstrated that the proposed method improved bidirectional similarity (BDS), a measure of video retargeting quality, up to 3.69 compared with existing video retargeting methods.
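
The backward-warping step described in the abstract can be illustrated with a minimal sketch. It assumes that each hole in the extended area already has a DV and shows only how the DV is used to fetch a pixel from the input frame. The function name `extrapolate_with_dv`, the `(dy, dx)` layout of the DV map, the integer-valued DVs, and the border clamping are all assumptions made for illustration; they are not taken from the paper, which generates the DVs with a CNN and adds temporal-coherence and distortion-suppression steps that are not shown here.

```python
def extrapolate_with_dv(frame, dv, pad):
    """Extend `frame` by `pad` columns on each side using backward warping.

    frame: list of rows, each a list of pixel values (H x W).
    dv:    H x (W + 2*pad) grid of (dy, dx) integer displacements, given in
           the coordinates of the extended frame (hypothetical layout).
           Only entries that fall in the extended side areas are read.
    """
    h, w = len(frame), len(frame[0])
    out_w = w + 2 * pad
    out = []
    for y in range(h):
        row = []
        for x in range(out_w):
            if pad <= x < pad + w:
                # Inside the original frame: copy the pixel unchanged.
                row.append(frame[y][x - pad])
            else:
                # Hole in the extended area: follow the DV back to a
                # source pixel in the input frame (backward warping).
                dy, dx = dv[y][x]
                src_y = min(max(y + dy, 0), h - 1)        # clamp to frame
                src_x = min(max(x + dx - pad, 0), w - 1)  # extended -> input
                row.append(frame[src_y][src_x])
        out.append(row)
    return out


# Usage: a 2x3 frame extended by one column per side, with DVs that point
# each hole at the nearest border pixel of the input frame.
frame = [[0, 1, 2],
         [3, 4, 5]]
dv = [[(0, 0)] * 5 for _ in range(2)]
for y in range(2):
    dv[y][0] = (0, 1)    # left hole  -> first input column
    dv[y][4] = (0, -1)   # right hole -> last input column
extended = extrapolate_with_dv(frame, dv, pad=1)
```

Because each hole reads from the input frame rather than pushing input pixels outward, every position in the extended area receives exactly one value, and the original frame region is copied without deformation.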

Updated: 2020-01-01
