Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement
arXiv - CS - Graphics, Pub Date: 2021-06-21, DOI: arxiv-2106.11423
Huiwen Luo, Koki Nagano, Han-Wei Kung, Mclean Goldwhite, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li

We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. While the input image can show a smiling person or be taken under extreme lighting conditions, our method reliably produces a high-quality textured model of the person's face with a neutral expression and skin textures under diffuse lighting conditions. Cutting-edge 3D face reconstruction methods use non-linear morphable face models combined with GAN-based decoders to capture the likeness and details of a person, but they fail to produce neutral head models with unshaded albedo textures, which are critical for creating relightable and animation-friendly avatars that can be integrated into virtual environments. The key challenge for existing methods is the lack of training and ground-truth data containing normalized 3D faces. We propose a two-stage approach to address this problem. First, we adopt a highly robust normalized 3D face generator by embedding a non-linear morphable face model into a StyleGAN2 network. This allows us to generate detailed but normalized facial assets. This inference step is followed by a perceptual refinement step that uses the generated assets as regularization to cope with the limited number of available training samples of normalized faces. We further introduce a Normalized Face Dataset, which consists of a combination of photogrammetry scans, carefully selected photographs, and generated fake people with neutral expressions under diffuse lighting conditions. While our prepared dataset contains two orders of magnitude fewer subjects than those used by cutting-edge GAN-based 3D facial reconstruction methods, we show that it is possible to produce high-quality normalized face models for very challenging unconstrained input images, and we demonstrate superior performance to the current state of the art.
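
To make the two-stage pipeline concrete, the following is a minimal, hypothetical PyTorch sketch of the idea described in the abstract: a stage-1 generator produces normalized assets (albedo and geometry) from a latent code, and a stage-2 refinement optimizes the latent with a perceptual loss while regularizing toward the stage-1 assets. All module names, shapes, the encoder, the differentiable renderer, and the simple perceptual distance are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedFaceGenerator(nn.Module):
    """Stand-in for the StyleGAN2-based generator with an embedded non-linear
    morphable face model: maps a latent code to a normalized albedo texture
    and a set of mesh vertices. Shapes are placeholders."""
    def __init__(self, latent_dim=512, tex_res=256, n_verts=5023):
        super().__init__()
        self.to_albedo = nn.Sequential(
            nn.Linear(latent_dim, tex_res * tex_res * 3), nn.Sigmoid())
        self.to_geometry = nn.Linear(latent_dim, n_verts * 3)
        self.tex_res, self.n_verts = tex_res, n_verts

    def forward(self, w):
        albedo = self.to_albedo(w).view(-1, 3, self.tex_res, self.tex_res)
        verts = self.to_geometry(w).view(-1, self.n_verts, 3)
        return albedo, verts

def perceptual_loss(render, target):
    # Placeholder for an LPIPS/VGG-style perceptual distance.
    return F.l1_loss(render, target)

def reconstruct(image, generator, encoder, renderer, steps=200, reg_weight=0.1):
    """Stage 1: predict an initial latent and normalized assets from the photo.
    Stage 2: refine the latent with a perceptual loss, regularized toward the
    stage-1 assets so the result stays a normalized (neutral, unshaded) face."""
    with torch.no_grad():
        w0 = encoder(image)                      # initial latent from the photo
        albedo0, verts0 = generator(w0)          # stage-1 normalized assets
    w = w0.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=1e-2)
    for _ in range(steps):
        albedo, verts = generator(w)
        render = renderer(albedo, verts)         # differentiable renderer, assumed given
        loss = perceptual_loss(render, image)
        loss = loss + reg_weight * (F.mse_loss(albedo, albedo0) +
                                    F.mse_loss(verts, verts0))
        opt.zero_grad(); loss.backward(); opt.step()
    return generator(w)
```

The regularization term is what keeps the refined result anchored to the stage-1 normalized assets, which is how the abstract describes coping with the small number of normalized training samples; the exact losses and weights used in the paper are not specified here.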

Updated: 2021-06-25