Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation.
IEEE Transactions on Image Processing (IF 10.8), Pub Date: 2020-08-27, DOI: 10.1109/tip.2020.3018224
Yurui Ren, Ge Li, Shan Liu, Thomas H. Li

Pose-guided person image generation and animation aim to transform a source person image to target poses. These tasks require spatial manipulation of the source data. However, convolutional neural networks are limited by their lack of ability to spatially transform inputs. In this article, we propose a differentiable global-flow local-attention framework that reassembles the inputs at the feature level. The framework first estimates global flow fields between sources and targets. Then, corresponding local source feature patches are sampled with content-aware local attention coefficients. We show that our framework can spatially transform the inputs in an efficient manner. We further model temporal consistency for the person image animation task to generate coherent videos. Experimental results on both the image generation and animation tasks demonstrate the superiority of our model. In addition, results on novel view synthesis and face image animation show that our model is applicable to other tasks requiring spatial transformation. The source code of our project is available at https://github.com/RenYurui/Global-Flow-Local-Attention .
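The abstract describes warping source features with a global flow field and then blending a local source patch using content-aware attention weights. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the flow maps each target position to a source location, a k×k source patch is sampled around that location, and the patch is blended with softmax weights. The similarity-to-centre scoring used here is a hypothetical stand-in for the paper's learned attention coefficients.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def local_attention_warp(source_feat, flow, k=3):
    """Warp source features toward a target pose (illustrative sketch).

    source_feat: (H, W, C) source feature map.
    flow:        (H, W, 2) global flow field giving, for each target
                 position, a (row, col) location in the source.
    k:           local patch size sampled around each flow target.

    For each target position, a k x k source patch centred at the flow
    target is blended with content-aware weights. Here the weights come
    from dot-product similarity between each patch element and the patch
    centre -- an assumption standing in for learned coefficients.
    """
    H, W, C = source_feat.shape
    r = k // 2
    out = np.zeros_like(source_feat)
    # Edge-pad so patches near the border stay in bounds.
    padded = np.pad(source_feat, ((r, r), (r, r), (0, 0)), mode="edge")
    for y in range(H):
        for x in range(W):
            # Round the flow target to the nearest source pixel.
            sy = int(np.clip(round(flow[y, x, 0]), 0, H - 1))
            sx = int(np.clip(round(flow[y, x, 1]), 0, W - 1))
            patch = padded[sy:sy + k, sx:sx + k, :]      # (k, k, C)
            centre = padded[sy + r, sx + r, :]           # (C,)
            scores = (patch * centre).sum(axis=-1).ravel()
            weights = softmax(scores)                    # attention coefficients
            out[y, x, :] = (patch.reshape(-1, C) * weights[:, None]).sum(axis=0)
    return out
```

With an identity flow (each target position maps to itself) and a constant feature map, the warp returns the input unchanged, which is a quick sanity check of the sampling and weighting logic.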

Updated: 2020-09-05