MGGR: MultiModal-Guided Gaze Redirection with Coarse-to-Fine Learning
arXiv - CS - Computer Vision and Pattern Recognition Pub Date : 2020-04-07 , DOI: arxiv-2004.03064
Jingjing Chen, Jichao Zhang, Jiayuan Fan, Tao Chen, Enver Sangineto, Nicu Sebe

Gaze redirection aims at manipulating a given eye gaze toward a desired direction according to a reference angle, and it can be applied to many real-life scenarios, such as video-conferencing or taking group photos. However, previous works suffer from two limitations: (1) low-quality generation and (2) low redirection precision. To this end, we propose an innovative MultiModal-Guided Gaze Redirection (MGGR) framework that fully exploits eye-map images and target angles to adjust a given eye appearance through a designed coarse-to-fine learning scheme. Our contribution is combining flow learning and adversarial learning for coarse-to-fine generation. More specifically, the role of the proposed coarse branch with a flow field is to rapidly learn the spatial transformation that yields a warped result with the desired gaze. The proposed fine-grained branch consists of a generator network with conditional residual-image learning and a multi-task discriminator, which together reduce the gap between the warped image and the ground-truth image and recover finer texture details. Moreover, we propose leveraging the gazemap for the desired angles as an extra guide to further improve the precision of gaze redirection. Extensive experiments on a benchmark dataset show that the proposed method outperforms state-of-the-art methods in terms of image quality and redirection precision. Further evaluations demonstrate the effectiveness of the proposed coarse-to-fine and gazemap modules.
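The coarse-to-fine idea above can be illustrated with a minimal sketch: a coarse branch that warps the input image with a per-pixel flow field, followed by a fine branch that adds a residual image to the warped result. This is a simplified illustration, not the paper's implementation — the actual flow field and residual are predicted by learned networks with differentiable bilinear sampling, and the function names (`warp_with_flow`, `coarse_to_fine`) are hypothetical.

```python
import numpy as np

def warp_with_flow(img, flow):
    """Coarse branch (sketch): warp an H x W grayscale image with a
    per-pixel flow field. flow[y, x] = (dy, dx) gives the offset of the
    source pixel each output pixel samples from. Nearest-neighbour
    lookup is used here for simplicity; a learned model would use
    differentiable bilinear sampling instead.
    """
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            sy = int(round(float(np.clip(y + flow[y, x, 0], 0, h - 1))))
            sx = int(round(float(np.clip(x + flow[y, x, 1], 0, w - 1))))
            out[y, x] = img[sy, sx]
    return out

def coarse_to_fine(img, flow, residual):
    """Fine branch (sketch): add a conditional residual image to the
    warped result to recover texture detail that pure warping loses,
    then clamp to the valid intensity range [0, 1]."""
    warped = warp_with_flow(img, flow)
    refined = np.clip(warped + residual, 0.0, 1.0)
    return refined, warped
```

With a zero flow field the coarse branch is the identity, and with a zero residual the fine branch returns the warped image unchanged; in the full framework both quantities are predicted conditioned on the target angle and gazemap.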

Updated: 2020-05-21