Deep, Landmark-Free FAME: Face Alignment, Modeling, and Expression Estimation


International Journal of Computer Vision (IF 19.5), Pub Date: 2019-02-13, DOI: 10.1007/s11263-019-01151-x
Feng-Ju Chang, Anh Tuan Tran, Tal Hassner, Iacopo Masi, Ram Nevatia, Gérard Medioni

We present a novel method for modeling 3D face shape, viewpoint, and expression from a single, unconstrained photo. Our method uses three deep convolutional neural networks to estimate each of these components separately. Importantly, unlike others, our method does not use facial landmark detection at test time; instead, it estimates these properties directly from image intensities. In fact, rather than using detectors, we show how accurate landmarks can be obtained as a by-product of our modeling process. We rigorously test our proposed method. To this end, we raise a number of concerns with existing practices used in evaluating face landmark detection methods. In response to these concerns, we propose novel paradigms for testing the effectiveness of rigid and non-rigid face alignment methods without relying on landmark detection benchmarks. We evaluate rigid face alignment by measuring its effects on face recognition accuracy on the challenging IJB-A and IJB-B benchmarks. Non-rigid, expression estimation is tested on the CK+ and EmotiW’17 benchmarks for emotion classification. We do, however, report the accuracy of our approach as a landmark detector for 3D landmarks on AFLW2000-3D and 2D landmarks on 300W and AFLW-PIFA. A surprising conclusion of these results is that better landmark detection accuracy does not necessarily translate to better face processing. Parts of this paper were previously published by Tran et al. (2017) and Chang et al. (2017, 2018).
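
To make the "landmarks as a by-product" idea concrete, here is a minimal sketch of how 2D landmarks could be read off an estimated 3D face once shape, expression, and viewpoint are known. It is not the authors' implementation. The linear shape model (mean plus identity and expression bases), the Euler-angle pose, the perspective projection with a single focal length, and all names (`reconstruct_shape`, `project_landmarks`, `landmark_idx`) are assumptions made for illustration; in the described method, the coefficients and pose would come from the three networks rather than being set by hand.

```python
import math


def rotation_from_euler(pitch, yaw, roll):
    """3x3 rotation R = Rz(roll) * Ry(yaw) * Rx(pitch), angles in radians."""
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    cz, sz = math.cos(roll), math.sin(roll)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def reconstruct_shape(mean, id_basis, exp_basis, alpha, eta):
    """Assumed linear face model: mean + sum(alpha_k * id_k) + sum(eta_k * exp_k).

    mean is a list of (x, y, z) vertices; each basis component has the same layout.
    """
    shape = []
    for v, base in enumerate(mean):
        point = list(base)
        for coeff, component in zip(alpha, id_basis):
            for axis in range(3):
                point[axis] += coeff * component[v][axis]
        for coeff, component in zip(eta, exp_basis):
            for axis in range(3):
                point[axis] += coeff * component[v][axis]
        shape.append(tuple(point))
    return shape


def project_landmarks(vertices, landmark_idx, R, t, focal):
    """Rotate, translate, and perspective-project the chosen landmark vertices."""
    points_2d = []
    for idx in landmark_idx:
        x, y, z = vertices[idx]
        cam = [R[r][0] * x + R[r][1] * y + R[r][2] * z + t[r] for r in range(3)]
        points_2d.append((focal * cam[0] / cam[2], focal * cam[1] / cam[2]))
    return points_2d


# Toy three-vertex "face"; real models have tens of thousands of vertices.
mean = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
id_basis = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
exp_basis = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, -1.0, 0.0)]]

# Hand-set stand-ins for the network outputs (hypothetical values).
alpha, eta = [0.5], [0.5]
R = rotation_from_euler(0.0, 0.0, 0.0)
t = (0.0, 0.0, 10.0)

vertices = reconstruct_shape(mean, id_basis, exp_basis, alpha, eta)
landmarks_2d = project_landmarks(vertices, [1, 2], R, t, focal=100.0)
```

Because the landmark positions are fixed vertex indices on the 3D model, no separate detector is needed in this sketch: any change in the estimated shape, expression, or pose moves the projected points accordingly.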


Updated: 2019-02-13
