Fully Deformable Network for Multiview Face Image Synthesis.
IEEE Transactions on Neural Networks and Learning Systems ( IF 10.4 ) Pub Date : 2022-11-07 , DOI: 10.1109/tnnls.2022.3216018
Cheng Xu 1 , Keke Li 1 , Xuandi Luo 2 , Xuemiao Xu 1 , Shengfeng He 3 , Kun Zhang 4

Photorealistic multiview face synthesis from a single image is a challenging problem. Existing works mainly learn a texture mapping model from the source to the target faces. However, they rarely consider the geometric constraints on the internal deformation arising from pose variations, which causes a high level of uncertainty in face pose modeling and hence produces inferior results for large pose variations. Moreover, current methods typically suffer from an undesired loss of facial details due to the adoption of the de facto standard encoder-decoder architecture without any skip connections (SCs). In this article, we directly learn and exploit geometric constraints and propose a fully deformable network to simultaneously model the deformations of both landmarks and faces for face synthesis. Specifically, our model consists of two parts: a deformable landmark learning network (DLLN) and a gated deformable face synthesis network (GDFSN). The DLLN converts an initial reference landmark to an individual-specific target landmark as delicate pose guidance for face rotation. The GDFSN adopts a dual-stream structure, with one stream estimating the deformation of the two views in the form of convolution offsets according to the source pose and the converted target pose, and the other leveraging the predicted deformation offsets to create the target face. In this way, individual-aware pose changes are explicitly modeled in the face generator to cope with the geometric transformation, by adaptively focusing on pertinent regions of the source face. To compensate for offset estimation errors, we introduce a soft-gating mechanism for adaptive fusion between deformable features and primitive features. Additionally, a pose-aligned SC (PASC) is tailored to propagate low-level input features to the appropriate positions in the output features, further enhancing facial details and identity preservation.
Extensive experiments on six benchmarks show that our approach performs favorably against state-of-the-art methods, especially under large pose changes. Code is available at https://github.com/cschengxu/FDFace.
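Two of the abstract's mechanisms can be illustrated compactly: sampling features at predicted per-pixel offsets (the core operation behind deformable convolution) and the soft gate that blends deformable and primitive features. The sketch below is a minimal NumPy illustration under stated assumptions; the function names, shapes, and the idea of passing gate logits directly are illustrative simplifications, not taken from the paper's released code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def warp_with_offsets(feat, offsets):
    """Bilinearly sample a 2-D feature map at positions displaced by
    per-pixel offsets, as a deformable convolution does before its
    weighted sum. feat: (H, W); offsets: (H, W, 2) holding (dy, dx)."""
    H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    ys = np.clip(ys + offsets[..., 0], 0, H - 1)
    xs = np.clip(xs + offsets[..., 1], 0, W - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy, wx = ys - y0, xs - x0
    # Standard bilinear interpolation over the four neighboring samples.
    return ((1 - wy) * (1 - wx) * feat[y0, x0]
            + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0]
            + wy * wx * feat[y1, x1])

def soft_gated_fusion(deformable_feat, primitive_feat, gate_logits):
    """Convex combination of deformable and primitive features driven by a
    soft gate in (0, 1). In the actual network the gate would be produced
    by a small learned head; here the logits are passed in directly."""
    g = sigmoid(gate_logits)
    return g * deformable_feat + (1.0 - g) * primitive_feat
```

With zero offsets the warp reduces to the identity, and a zero gate logit yields an even blend of the two feature streams; in the full model both the offsets and the gate are predicted from the source and target poses.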

Updated: 2022-11-07