Video-based person-dependent and person-independent facial emotion recognition
Signal, Image and Video Processing (IF 2.3), Pub Date: 2021-01-19, DOI: 10.1007/s11760-020-01830-0
Noushin Hajarolasvadi, Enver Bashirov, Hasan Demirel

Facial emotion recognition is a challenging problem that has attracted the attention of researchers in the last decade. In this paper, we present a system for facial emotion recognition in video sequences and evaluate it for person-dependent and person-independent cases. Depending on the purpose of the designed system, the importance of training a personalized model versus a non-personalized one differs. First, we compute 60 geometric features for the video frames of two datasets, namely the RML and SAVEE databases. In the next step, k-means clustering is applied to the geometric features to select the k most discriminant frames for each video clip. Then, we employ various classifiers, such as linear support vector machine (SVM) and Gaussian SVM, to find the best value of k. Finally, five pre-trained convolutional neural networks (CNNs), namely VGG-16, VGG-19, ResNet-50, AlexNet, and GoogleNet, are used to evaluate two scenarios: person-dependent and person-independent emotion recognition. Additionally, the effect of geometric features on keyframe selection in the person-dependent and person-independent scenarios is studied based on different regions of the face. The features extracted by the CNNs are also visualized using the t-distributed stochastic neighbor embedding algorithm to study their discriminative ability in these scenarios. Experiments show that person-dependent systems achieve higher accuracy and are suitable for use in personalized systems.
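The keyframe-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which frame is kept from each cluster or how k-means is initialized, so two assumptions are made here: the frame closest to each centroid is taken as the keyframe, and the centroids start from evenly spaced frames of the clip. The function names (`kmeans`, `select_keyframes`) and the synthetic clip are hypothetical, and the code uses only the Python standard library; it is not the authors' implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means; returns (centroids, cluster label per point).

    Assumption: centroids are initialized from evenly spaced frames.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of frames")
    centroids = [list(points[int((c + 0.5) * n / k)]) for c in range(k)]
    labels = [0] * n
    for _ in range(iterations):
        # Assignment step: attach each frame to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centroids, labels


def select_keyframes(frame_features, k):
    """Return sorted indices of one representative frame per cluster.

    Assumption: the representative is the frame nearest to the centroid.
    """
    centroids, labels = kmeans(frame_features, k)
    keyframes = []
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            nearest = min(
                members,
                key=lambda i: squared_distance(frame_features[i], centroid),
            )
            keyframes.append(nearest)
    return sorted(keyframes)


# Synthetic clip: 30 frames with 60 features each, forming three
# well-separated groups of 10 consecutive frames (made-up data).
rng = random.Random(0)
clip = [
    [5.0 * (i // 10) + rng.gauss(0.0, 0.1) for _ in range(60)]
    for i in range(30)
]
keyframes = select_keyframes(clip, k=3)
print(keyframes)  # one frame index from each of the three groups
```

In a full pipeline, the selected frame indices would then be passed to the classifiers or CNNs mentioned above; choosing k itself would require repeating the selection for several values and comparing classification accuracy.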

Updated: 2021-01-19
