Fine-grained facial landmark detection exploiting intermediate feature representations
Computer Vision and Image Understanding (IF 4.3), Pub Date: 2020-07-08, DOI: 10.1016/j.cviu.2020.103036
Yongzhe Yan, Stefan Duffner, Priyanka Phutane, Anthony Berthelier, Xavier Naturel, Christophe Blanc, Christophe Garcia, Thierry Chateau

Facial landmark detection has been an active research subject over the last decade. In this paper, we present a new approach for Fine-grained Facial Landmark Detection (FFLD) that improves the spatial precision of the detected points. High spatial precision of facial landmarks is crucial for many applications related to aesthetic rendering, such as face modeling, face animation, and virtual make-up. Since most facial landmarks are positioned on visible boundary lines, we train a model that encourages the detected landmarks to stay on these boundaries. Our proposed Convolutional Neural Network (CNN) effectively exploits lower-level feature maps, which contain abundant boundary information. To this end, besides the main CNN that predicts facial landmark positions, we use several additional components called CropNets. The CropNets receive patches cropped from feature maps at different stages of the main CNN and estimate fine corrections to its predicted positions. We also introduce a novel robust spatial loss function based on pixel-wise differences between patches cropped at the predicted and ground-truth positions. To further improve landmark localization, our framework combines several loss functions that optimize precision at several stages in different ways. Extensive experiments show that our framework significantly increases the local precision of state-of-the-art deep coordinate regression models.
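
The patch-based spatial loss mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names crop_patch and patch_difference_loss, the 5x5 patch size, the edge padding, the rounding to integer pixel positions, and the use of a mean absolute difference are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def crop_patch(feature_map, center, size=5):
    """Crop a size x size patch centered on (x, y) from an (H, W[, C]) map.

    Hypothetical helper: the map is edge-padded so that patches near the
    border keep a fixed shape, and the center is rounded to a pixel.
    """
    assert size % 2 == 1, "patch size is assumed to be odd"
    half = size // 2
    pad = [(half, half), (half, half)] + [(0, 0)] * (feature_map.ndim - 2)
    padded = np.pad(feature_map, pad, mode="edge")
    h, w = feature_map.shape[:2]
    # Clamp the rounded center to the valid pixel range.
    x = min(max(int(round(center[0])), 0), w - 1)
    y = min(max(int(round(center[1])), 0), h - 1)
    # After padding, pixel (y, x) sits at (y + half, x + half).
    return padded[y:y + size, x:x + size]


def patch_difference_loss(feature_map, predicted, ground_truth, size=5):
    """Mean absolute pixel-wise difference between patches cropped at the
    predicted and ground-truth landmark positions (assumed L1 form)."""
    losses = []
    for p, g in zip(predicted, ground_truth):
        diff = crop_patch(feature_map, p, size) - crop_patch(feature_map, g, size)
        losses.append(np.abs(diff).mean())
    return float(np.mean(losses))


# Toy map with a vertical boundary between columns 7 and 8.
toy_map = np.zeros((16, 16))
toy_map[:, 8:] = 1.0

# Ground-truth landmark on the boundary, given as (x, y).
gt = [(8, 8)]
loss_along = patch_difference_loss(toy_map, [(8, 4)], gt)   # slides along the boundary
loss_across = patch_difference_loss(toy_map, [(10, 8)], gt)  # moves off the boundary
print(loss_along, loss_across)
```

In this toy setting, a prediction that slides along the boundary sees the same local appearance as the ground truth and incurs no penalty, while a prediction that moves off the boundary is penalized. This is only one plausible reading of how a patch-difference loss could encourage landmarks to stay on boundary lines.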

Updated: 2020-07-16
