Dual-graph regularized discriminative transfer sparse coding for facial expression recognition
Introduction
The goal of facial expression recognition is to recognize expressions from facial images. It has attracted increasing interest due to its applications in a wide range of fields, e.g., computer vision, multimedia entertainment, and intelligent human-computer interaction [1], [2]. Compared with other forms of emotional expression, facial expressions convey emotional states more vividly. Thus, facial expression recognition has drawn more and more attention. In [3], Ekman et al. defined six basic facial expression categories, i.e., anger, disgust, fear, happiness, sadness and surprise, which are illustrated in Fig. 1. Most existing works focus on recognizing these six basic expressions, which are universal and recognizable across different cultures.
Overall, a facial expression recognition system can be roughly partitioned into two major parts, i.e., facial expression feature extraction and representation, and facial expression classification. The main task of the first part is to extract a set of facial features that are related to human expressions, whereas the second part determines the emotion category based on the extracted features. For the former, many feature extraction algorithms have been successfully proposed, e.g., scale-invariant feature transform (SIFT) [4], local binary patterns (LBP) [5], histograms of oriented gradients (HOG) [6], Gabor wavelets [7], facial movement features [8], features from salient facial patches [9], local binary pattern histograms from three orthogonal planes (LBP-TOP) [10] and features extracted using different deep learning algorithms [11]. For the latter, a variety of classifiers have been proposed in the literature, e.g., support vector machine (SVM) [12], hidden Markov model (HMM) [13], AdaBoost [14] and deep neural networks [11], [15], [16]. These methods can achieve satisfactory results in most facial expression recognition tasks.
In recent years, sparse coding has been successfully used in computer vision applications such as face recognition [17], [18], image classification [19] and image restoration [20]. In computer vision, the high dimensionality of the feature vector is a difficult problem. Sparse coding represents a high-dimensional feature vector as a linear combination of a small number of basis vectors, which yields a sparse representation. In this way, the feature vector can be described by only a few effective coefficients, which is easy to interpret and can greatly reduce the computational cost of subsequent processing. Recent studies [21], [22], [23] have shown that sparse coding is one of the successful representation models for facial expression recognition. For example, Tariq et al. [21] develop a generic sparse coding feature for non-frontal facial expression recognition. In [22], Jampour et al. present a multi-view facial expression recognition method that uses a local linear regression of sparse codes. In [23], Chanti et al. use the label information to learn a discriminative dictionary for sparse representation in order to recognize spontaneous facial expressions.
The above-mentioned methods mostly assume that the training and testing data come from the same dataset, i.e., that the data are sampled from a common shared distribution [24]. However, this assumption often does not hold in practice, and when it is violated the recognition performance drops heavily. To address this problem, and with the development of transfer learning [25], [26], cross-dataset facial expression recognition using transfer learning algorithms has become a hot research topic in recent years. In [27], Yan et al. propose a transfer subspace learning approach to learn a robust feature subspace for cross-dataset facial expression recognition, which transfers the knowledge gained from the source domain to the target domain to improve the recognition performance. In [28], Chu et al. introduce a selective transfer machine (STM) method for personalized facial expression analysis. By re-weighting the training samples, it reduces the mismatch between the training and testing datasets. Zheng et al. [29] propose a transductive transfer regularized least-squares regression (TTRLSR) model to solve the cross-domain facial expression recognition problem. Note that when the above-mentioned sparse coding methods are applied directly to cross-domain problems, data sampled from different distributions may be quantized into different visual words of the codebook and encoded with different representations [30]. As a result, the learned dictionary cannot encode the images effectively, which greatly challenges the robustness of existing sparse coding algorithms in cross-dataset recognition. Thus, by exploiting transfer learning techniques, Long et al. [30] develop a transfer sparse coding (TSC) method, which combines the traditional sparsity constraint with a distribution distance measure, i.e., maximum mean discrepancy (MMD) [31], [32].
However, TSC simply introduces the MMD constraint into sparse coding and neglects both the intra-domain and inter-domain graphs and the discriminative label information, all of which are important for classification.
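To make the MMD term concrete, the following is a minimal sketch of the empirical MMD with a linear kernel, which is the squared distance between the mean source code and the mean target code. The function names `linear_mmd` and `mmd_matrix`, and the convention that columns are samples, are assumptions made for illustration and are not taken from the paper. The second function builds a matrix M such that tr(S M S^T) equals the same quantity, which is the form in which such a term is usually added to a sparse coding objective.

```python
import numpy as np

def linear_mmd(source_codes, target_codes):
    """Squared empirical MMD with a linear kernel; columns are samples."""
    diff = source_codes.mean(axis=1) - target_codes.mean(axis=1)
    return float(diff @ diff)

def mmd_matrix(n_source, n_target):
    """Matrix M with tr(S M S^T) == linear_mmd(S_s, S_t) for S = [S_s, S_t]."""
    # e holds +1/n_s for source columns and -1/n_t for target columns,
    # so that S @ e is the difference of the two domain means.
    e = np.concatenate([
        np.full(n_source, 1.0 / n_source),
        np.full(n_target, -1.0 / n_target),
    ])
    return np.outer(e, e)
```

Writing the discrepancy as a trace keeps it quadratic in the codes S, so it can be combined with other quadratic regularizers on S.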
In this paper, to deal with the challenging cross-dataset facial expression recognition problem, inspired by recent progress in sparse coding and transfer learning, we propose a novel dual-graph regularized discriminative transfer sparse coding (DGDTSC) method for robust facial expression recognition. The core idea of DGDTSC lies in seeking the robust transfer sparse codes to reduce the distribution divergence between two different databases. In this way, the source knowledge can be well adapted to facilitate the target expression recognition. Fig. 2 shows the diagram of our approach.
The major contributions of our work are summarized as follows:
- Our algorithm provides a unified transfer learning framework, which combines sparse coding, dual-graph Laplacian regularization and discriminative regularization. Experimental results on cross-dataset facial expression recognition tasks show its superiority.
- When learning sparse codes, we construct a dual graph to explicitly measure the inter-domain and intra-domain similarity among different datasets, which not only provides effective guidance for similarity measurement in transfer learning, but also preserves the local geometric structure of the features.
- We jointly use MMD and the dual graph as distance measures. Therefore, both the global and the local distance measurements across domains are preserved, which effectively reduces the distribution divergence between different datasets.
- By utilizing the label information, we further introduce a discriminative constraint which aims to maximize the intra-class compactness and the inter-class separability. Thus, the learned sparse representations have more discriminative power.
- A mathematical model of our proposed method is presented and an efficient optimization algorithm is applied to solve it. We validate the results by comparing with state-of-the-art transfer learning algorithms, and further demonstrate the effectiveness of our method through a series of experiments.
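The dual-graph idea in the contributions above can be illustrated with a small sketch. The paper's exact graph construction is not given in this excerpt, so everything here is an assumption: a symmetric 0/1 k-nearest-neighbour graph is built over the pooled source and target features, its edges are split into an intra-domain part and an inter-domain part, and each part is turned into a graph Laplacian L = D - W. The names `knn_affinity`, `laplacian` and `dual_graph_laplacians` are hypothetical.

```python
import numpy as np

def knn_affinity(features, k):
    """Symmetric 0/1 k-nearest-neighbour affinity; rows of `features` are samples."""
    n = features.shape[0]
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # keep an edge if either endpoint selected it

def laplacian(W):
    """Unnormalized graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

def dual_graph_laplacians(features, is_source, k=3):
    """Split one pooled kNN graph into intra-domain and inter-domain Laplacians.

    `is_source` is a boolean NumPy array marking which rows are source samples.
    """
    W = knn_affinity(features, k)
    same_domain = is_source[:, None] == is_source[None, :]
    return laplacian(W * same_domain), laplacian(W * ~same_domain)
```

With codes S stored column-wise, tr(S L S^T) equals half the weighted sum of squared distances between the codes of connected samples, so penalizing it encourages neighbouring samples to receive similar codes.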
The rest of this paper is organized as follows. We review the related works on sparse coding and transfer learning in Section 2. In Sections 3 and 4, we introduce our proposed algorithms, i.e., DGTSC and DGDTSC, and the optimization scheme, which covers learning the sparse representations and the dictionary, respectively. The experimental results on cross-database facial expression recognition tasks are presented in Section 5. Finally, we conclude our work in Section 6.
Section snippets
Related work
In this section, we discuss the existing works that are most closely related to our proposed method, including sparse coding and transfer learning, and show the inherent relationship among these methods.
Proposed methods
In this section, we first present the notations utilized in this work. Then, we elaborate the details of our proposed methods.
Optimization algorithm
In this section, we discuss the optimization of DGDTSC. The optimization algorithm of problem (18) is divided into two iterative steps: 1) learning transfer sparse codes S with dictionary B fixed; and 2) learning dictionary B with transfer sparse codes S fixed.
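The two-step alternation described above can be sketched on a simplified objective. Problem (18) is not reproduced in this excerpt, so the code below is only an illustration under stated assumptions: it uses the plain sparse coding objective 0.5 * ||X - B S||_F^2 + lam * ||S||_1 without the MMD, dual-graph or discriminative terms, it uses ISTA for the code step and ridge least squares followed by a column rescaling heuristic for the dictionary step, and these solvers are not claimed to be the ones used in the paper. All function names and the parameters `lam`, `ridge`, `n_steps` and `n_iters` are hypothetical.

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def objective(X, B, S, lam):
    return 0.5 * np.linalg.norm(X - B @ S) ** 2 + lam * np.abs(S).sum()

def update_codes(X, B, S, lam, n_steps=50):
    """Step 1: ISTA iterations on S with the dictionary B fixed."""
    # 1 / (largest singular value of B)^2 is a safe step size for ISTA.
    step = 1.0 / max(np.linalg.norm(B, 2) ** 2, 1e-12)
    for _ in range(n_steps):
        grad = B.T @ (B @ S - X)
        S = soft_threshold(S - step * grad, step * lam)
    return S

def update_dictionary(X, S, ridge=1e-6):
    """Step 2: ridge least squares for B with S fixed, then shrink long columns."""
    gram = S @ S.T + ridge * np.eye(S.shape[0])
    B = np.linalg.solve(gram, S @ X.T).T
    # Heuristic: rescale only columns whose norm exceeds 1.
    return B / np.maximum(np.linalg.norm(B, axis=0), 1.0)

def alternate(X, n_atoms, lam=0.1, n_iters=10, seed=0):
    """Alternate between the code step and the dictionary step."""
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(X.shape[0], n_atoms))
    B /= np.linalg.norm(B, axis=0)
    S = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iters):
        S = update_codes(X, B, S, lam)
        B = update_dictionary(X, S)
    return B, S
```

Each subproblem is convex when the other variable is held fixed, which is why the alternation is a natural choice. The column rescaling here is a simple stand-in for the exact norm-constrained dictionary update.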
Experiments
In this section, we conduct extensive cross-database facial expression recognition experiments on publicly available facial expression datasets to evaluate our methods.
Conclusion
In this paper, we have proposed a novel transfer sparse coding approach, called dual-graph regularized discriminative transfer sparse coding (DGDTSC), for robust cross-dataset facial expression recognition. An important advantage of our method is constructing a dual-graph to preserve the inter-domain and intra-domain geometrical information, which can effectively reduce the distribution divergence between different datasets. Moreover, we encode the discriminative information into sparse coding,
CRediT authorship contribution statement
Dongliang Chen: Methodology, Experiments, Writing – original draft. Peng Song: Supervision, Methodology, Writing – review & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work is supported in part by the National Natural Science Foundation of China under Grant 61703360 and the Fundamental Research Funds for the Central Universities under Grant CDLS-2019-01.
Dongliang Chen received the B.S. degree in Computer Science from Yantai University, Yantai, China, in 2018. He is currently pursuing the M.S. degree in Computer Science at Yantai University. His current main research interests include affective computing and transfer learning.
References (67)
- et al., Facial expression recognition based on local binary patterns: a comprehensive study, Image Vis. Comput. (2009)
- et al., Facial expression recognition using radial encoding of local Gabor features and classifier synthesis, Pattern Recognit. (2012)
- et al., A neural-AdaBoost based facial expression recognition system, Expert Syst. Appl. (2014)
- et al., Fisher discrimination dictionary pair learning for image classification, Neurocomputing (2017)
- Transfer subspace learning for cross-dataset facial expression recognition, Neurocomputing (2016)
- et al., A selective multiple instance transfer learning method for text categorization problems, Knowl.-Based Syst. (2018)
- et al., Automatic analysis of facial expressions: the state of the art, IEEE Trans. Pattern Anal. Mach. Intell. (2000)
- et al., Constants across cultures in the face and emotion, J. Pers. Soc. Psychol. (1971)
- et al., A deep neural network-driven feature learning method for multi-view facial expression recognition, IEEE Trans. Multimed. (2016)
- et al., Learning multiscale active facial patches for expression analysis, IEEE Trans. Cybern. (2015)
- Emotion recognition using dynamic grid-based HOG features
- Facial expression recognition using facial movement features, IEEE Trans. Affect. Comput.
- Automatic facial expression recognition using features of salient facial patches, IEEE Trans. Affect. Comput.
- Dynamic texture recognition using local binary patterns with an application to facial expressions, IEEE Trans. Pattern Anal. Mach. Intell.
- Deep facial expression recognition: a survey
- Facial expression recognition using SVM classification on mic-macro patterns
- Automatic facial expression recognition using facial animation parameters and multistream HMMs, IEEE Trans. Inf. Forensics Secur.
- Automatic facial expression recognition system using deep network-based data fusion, IEEE Trans. Cybern.
- A compact deep learning model for robust facial expression recognition
- Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell.
- Local features are not lonely–Laplacian sparse coding for image classification
- Non-local sparse models for image restoration
- Multi-view facial expression recognition analysis with generic sparse coding feature
- Multi-view facial expressions recognition using local linear regression of sparse codes
- Spontaneous facial expression recognition using sparse representation
- Transfer linear subspace learning for cross-corpus speech emotion recognition, IEEE Trans. Affect. Comput.
- A survey on transfer learning, IEEE Trans. Knowl. Data Eng.
- Transfer learning for visual categorization: a survey, IEEE Trans. Neural Netw. Learn. Syst.
- Selective transfer machine for personalized facial expression analysis, IEEE Trans. Pattern Anal. Mach. Intell.
- Cross-domain color facial expression recognition using transductive transfer subspace learning, IEEE Trans. Affect. Comput.
- Transfer sparse coding for robust image representation
- A kernel method for the two-sample-problem
- Domain adaptation via transfer component analysis, IEEE Trans. Neural Netw.
Cited by (8)
Adaptive graph regularized transferable regression for facial expression recognition
2023, Digital Signal Processing: A Review Journal
Common Latent Embedding Space for Cross-Domain Facial Expression Recognition
2024, IEEE Transactions on Computational Social Systems
Adaptive Graph Modeling With Self-Training for Heterogeneous Cross-Scene Hyperspectral Image Classification
2024, IEEE Transactions on Geoscience and Remote Sensing
Facial expression recognition based on strong attention mechanism and residual network
2023, Multimedia Tools and Applications
THE SUCCESS OF FACIAL EXPRESSION RECOGNITION BY CARRIERS OF VARIOUS GENOTYPES OF THE COMT, DRD4, 5HT2A, MAOA GENES
2022, Experimental Psychology (Russia)
Peng Song is currently an associate professor with the School of Computer and Control Engineering, Yantai University, China. He received the B.S. degree in EE from Shandong University of Science and Technology, China, in 2006, and the M.E. and Ph.D. degrees in EE from Southeast University, China, in 2009 and 2014, respectively. His current main research interests include affective computing and pattern recognition.