Dual-graph regularized discriminative transfer sparse coding for facial expression recognition

https://doi.org/10.1016/j.dsp.2020.102906

Abstract

Facial expression recognition has recently received increasing attention due to its great potential in real-world applications. Conventional facial expression recognition is often conducted on the assumption that the training and testing data are obtained from the same dataset. In reality, however, the data are often collected from different devices or environments, which severely degrades recognition performance. To tackle this problem, in this paper we investigate the cross-dataset facial expression recognition problem and propose a novel dual-graph regularized transfer sparse coding method (DGTSC). Specifically, aiming to reduce the distribution divergence between different databases while preserving their geometrical structures, we construct a dual-graph, by defining inter-domain and intra-domain similarities, to measure the distance between different databases. Moreover, we further present a dual-graph regularized discriminative transfer sparse coding method (DGDTSC), which exploits the label information to endow our model with more discriminative power. Extensive experimental results and analysis on several facial expression datasets show the feasibility and effectiveness of the proposed methods.

Introduction

The goal of facial expression recognition is to recognize expressions from facial images. It has attracted increasing interest due to its applications in far-reaching fields, e.g., computer vision, multimedia entertainment, and human-computer intelligent interaction [1], [2]. Compared with other channels of emotional expression, facial expressions convey emotional states more expressively, which is why facial expression recognition has drawn more and more attention. In [3], Ekman et al. defined six basic facial expression categories, i.e., anger, disgust, fear, happiness, sadness and surprise, which are illustrated in Fig. 1. Most existing works focus on recognizing these six basic expressions, which are universal and recognizable across different cultures.

Overall, a facial expression recognition system can be roughly partitioned into two major parts: facial expression feature extraction and representation, and facial expression classification. The task of the first part is to extract a set of facial expression features related to human expressions, whereas the second part determines the emotion category based on the extracted features. For the former, many facial expression feature extraction algorithms have been proposed, e.g., the scale-invariant feature transform (SIFT) [4], local binary patterns (LBP) [5], histograms of oriented gradients (HOG) [6], Gabor wavelets [7], facial movement features [8], features from salient facial patches [9], local binary pattern histograms from three orthogonal planes (LBP-TOP) [10] and features extracted by various deep learning algorithms [11]. For the latter, a wide range of approaches have been proposed in the literature, e.g., the support vector machine (SVM) [12], hidden Markov model (HMM) [13], AdaBoost [14] and deep neural networks [11], [15], [16]. These methods achieve satisfying results in most facial expression recognition tasks.
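As a concrete illustration of the feature-extraction stage, the widely used LBP descriptor [5] can be sketched in a few lines of NumPy. This is a minimal, illustrative implementation (basic 3×3 neighbourhood only, without the uniform-pattern mapping or block partitioning commonly used in practice):

```python
import numpy as np

def lbp_codes(img):
    """Basic 3x3 LBP: threshold the 8 neighbours of each interior pixel
    against the centre pixel and pack the bits into an 8-bit code."""
    c = img[1:-1, 1:-1]
    # Neighbour offsets in clockwise order starting at the top-left.
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    h, w = img.shape
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (n >= c).astype(np.uint8) << bit
    return code

def lbp_histogram(img, bins=256):
    """Normalized 256-bin histogram of LBP codes: the texture feature vector."""
    hist, _ = np.histogram(lbp_codes(img), bins=bins, range=(0, bins))
    return hist / hist.sum()

img = np.random.default_rng(4).integers(0, 256, (48, 48)).astype(float)
feat = lbp_histogram(img)
print(feat.shape)  # (256,): one histogram bin per possible 8-bit code
```

In practice the face image is usually divided into blocks and per-block histograms are concatenated, which preserves spatial layout.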

In recent years, sparse coding has been successfully applied to computer vision tasks such as face recognition [17], [18], image classification [19] and image restoration [20]. In computer vision, the high dimensionality of feature vectors is a tricky problem. Sparse coding represents a high-dimensional feature vector as a linear combination of a small number of basis vectors, generating a sparse representation. In this way, the feature vector is described by only a few effective coefficients, which is easy to interpret and greatly reduces the computational cost of subsequent processing. Recent studies [21], [22], [23] have shown that sparse coding is one of the most successful representation models for facial expression recognition. For example, Tariq et al. [21] develop a generic sparse coding feature for non-frontal facial expression recognition. In [22], Jampour et al. present a multi-view facial expression recognition method using local linear regression of sparse codes. In [23], by using the label information, Chanti et al. learn a discriminative dictionary for sparse representation to recognize spontaneous facial expressions.
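The sparse codes described above are typically obtained by solving an l1-regularized least-squares problem. A minimal sketch using ISTA, one standard solver for this problem (not necessarily the one used in the cited works), on toy data:

```python
import numpy as np

def soft_threshold(v, t):
    # Element-wise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code_ista(x, B, lam=0.05, n_iter=200):
    """Solve min_s 0.5*||x - B s||^2 + lam*||s||_1 with ISTA."""
    L = np.linalg.norm(B, 2) ** 2          # Lipschitz constant of the gradient
    s = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ s - x)           # gradient of the smooth term
        s = soft_threshold(s - grad / L, lam / L)
    return s

# Toy feature vector that is an exact combination of two dictionary atoms.
rng = np.random.default_rng(0)
B = rng.standard_normal((20, 50))
B /= np.linalg.norm(B, axis=0)             # unit-norm atoms
s_true = np.zeros(50)
s_true[[3, 17]] = [1.5, -2.0]
x = B @ s_true
s = sparse_code_ista(x, B)
# Typically only a few coefficients remain active after thresholding.
print(int(np.sum(np.abs(s) > 1e-3)))
```

The l1 penalty drives most coefficients exactly to zero, which is what makes the resulting representation compact and interpretable.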

The above-mentioned methods are mostly carried out on the assumption that the training and testing data come from the same dataset, i.e., that the data are sampled from a common shared distribution [24]. However, in cross-dataset scenarios this assumption does not hold, and recognition performance suffers a heavy drop. To address this problem, with the development of transfer learning [25], [26], cross-dataset facial expression recognition using transfer learning algorithms has become a hot research topic in recent years. In [27], Yan et al. propose a transfer subspace learning approach that learns a robust feature subspace for cross-dataset facial expression recognition, transferring the knowledge gained from the source domain to the target domain to improve recognition performance. In [28], Chu et al. introduce a selective transfer machine (STM) for personalized facial expression analysis; by re-weighting the training samples, it reduces the mismatch between the training and testing datasets. Zheng et al. [29] propose a transductive transfer regularized least-squares regression (TTRLSR) model to solve the cross-domain facial expression recognition problem. Note that when the above sparse coding methods are applied directly to cross-domain problems, data sampled from different distributions may be quantized into different visual words of the codebook and encoded with different representations [30], which makes the learned dictionary unable to encode images effectively and greatly challenges the robustness of existing sparse coding algorithms on cross-dataset recognition problems. Thus, by exploiting transfer learning techniques, Long et al. [30] develop a transfer sparse coding (TSC) method that combines traditional sparsity constraints with a distance measure, the maximum mean discrepancy (MMD) [31], [32]. However, TSC simply introduces the MMD constraint into sparse coding and neglects both the intra-domain and inter-domain graph structures and the discriminative label information, which are important for classification.
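MMD compares two samples through the distance between their kernel mean embeddings; with a linear kernel it reduces to the squared distance between the domain means. The following minimal sketch (toy data, illustrative only) shows the quantity the MMD constraint penalizes:

```python
import numpy as np

def mmd2_linear(Xs, Xt):
    """Squared MMD with a linear kernel: the squared distance between the
    sample means of the two domains. Xs, Xt: (n_samples, n_features)."""
    return float(np.sum((Xs.mean(axis=0) - Xt.mean(axis=0)) ** 2))

rng = np.random.default_rng(1)
Xs = rng.standard_normal((100, 5))          # "source" samples
Xt = rng.standard_normal((100, 5)) + 2.0    # "target" samples with shifted mean
print(mmd2_linear(Xs, Xs[:50]))             # small: same underlying distribution
print(mmd2_linear(Xs, Xt))                  # large: distributions differ
```

In TSC and in the methods of this paper, an MMD-style term is imposed on the learned sparse codes rather than on raw features, so that minimizing it pulls the source and target code distributions together.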

In this paper, to deal with the challenging cross-dataset facial expression recognition problem, inspired by recent progress in sparse coding and transfer learning, we propose a novel dual-graph regularized discriminative transfer sparse coding (DGDTSC) method for robust facial expression recognition. The core idea of DGDTSC lies in seeking the robust transfer sparse codes to reduce the distribution divergence between two different databases. In this way, the source knowledge can be well adapted to facilitate the target expression recognition. Fig. 2 shows the diagram of our approach.

The major contributions of our work are summarized as follows:

  • Our algorithm provides a unified transfer learning framework, which elegantly combines sparse coding, dual-graph Laplacian regularization and discriminative regularization. Experimental results on cross-dataset facial expression recognition tasks show its superiority.

  • When learning sparse codes, we construct a dual-graph to explicitly measure the inter-domain and intra-domain similarity among different datasets, which can not only provide an effective guidance for similarity measurement of transfer learning, but also preserve the local geometric structural information of features.

  • We jointly take MMD and the dual-graph into account as the distance metric. Therefore, both global and local distance measurements across domains are preserved, which effectively reduces the distribution divergence between different datasets.

  • By utilizing the label information, we further introduce a discriminative constraint that minimizes the intra-class scatter and maximizes the inter-class separability. Thus, the learned sparse representations have more discriminative power.

  • A mathematical model of our proposed method is presented and an efficient optimization algorithm is applied to solve our model. We validate the results by comparing with state-of-the-art transfer learning algorithms. We further demonstrate the effectiveness of our method by a series of experiments.
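To make the dual-graph idea above concrete, the toy sketch below builds a graph Laplacian over pooled source and target samples and evaluates the smoothness penalty tr(S L Sᵀ) that graph regularization contributes to a sparse-coding objective. The data, the plain k-NN similarity, and the single pooled graph are illustrative assumptions; the paper's actual construction weights inter-domain and intra-domain edges separately:

```python
import numpy as np

def knn_graph(X, k=3):
    """Binary k-nearest-neighbour similarity graph over the rows of X."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    W = np.zeros_like(d)
    for i in range(len(X)):
        idx = np.argsort(d[i])[1:k + 1]   # k nearest neighbours, skipping self
        W[i, idx] = 1.0
    return np.maximum(W, W.T)             # symmetrize

def laplacian(W):
    # Unnormalized graph Laplacian L = D - W, which is positive semidefinite.
    return np.diag(W.sum(axis=1)) - W

# Hypothetical toy domains pooled into one sample set.
rng = np.random.default_rng(2)
Xs = rng.standard_normal((10, 4))         # "source" samples
Xt = rng.standard_normal((12, 4))         # "target" samples
X = np.vstack([Xs, Xt])
W = knn_graph(X)
L = laplacian(W)
S = rng.standard_normal((8, len(X)))      # toy sparse codes, one column per sample
reg = np.trace(S @ L @ S.T)               # graph-smoothness penalty on the codes
print(reg >= 0)                           # L is PSD, so the penalty prints True
```

Minimizing this trace term forces samples connected in the graph to receive similar codes, which is how the dual-graph preserves local geometric structure during transfer.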

The rest of this paper is organized as follows: We review the related work on sparse coding and transfer learning in Section 2. In Sections 3 and 4, we introduce our proposed algorithms, i.e., DGTSC and DGDTSC, together with the optimization scheme for learning the sparse representations and dictionaries. The experimental results on cross-database facial expression recognition tasks are presented in Section 5. Finally, we conclude our work in Section 6.


Related work

In this section, we discuss the existing works most closely related to our proposed method, including sparse coding and transfer learning, and show the inherent relationship among these methods.

Proposed methods

In this section, we first present the notation used in this work. Then, we elaborate on the details of our proposed methods.

Optimization algorithm

In this section, we discuss the optimization of DGDTSC. The optimization algorithm of problem (18) is divided into two iterative steps: 1) learning transfer sparse codes S with dictionary B fixed; and 2) learning dictionary B with transfer sparse codes S fixed.
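The two-step alternation above can be sketched on a simplified objective. The sketch below is illustrative only: it omits the MMD, dual-graph and discriminative terms of problem (18) and keeps just the reconstruction and sparsity terms, alternating ISTA code updates with a ridge-regularized least-squares dictionary update:

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def alternate(X, n_atoms=8, lam=0.1, outer=10, inner=50):
    """Alternately (1) learn codes S with dictionary B fixed, and
    (2) learn dictionary B with codes S fixed."""
    rng = np.random.default_rng(0)
    B = rng.standard_normal((X.shape[0], n_atoms))
    B /= np.linalg.norm(B, axis=0)
    S = np.zeros((n_atoms, X.shape[1]))
    for _ in range(outer):
        # Step 1: ISTA updates of the sparse codes.
        Lc = np.linalg.norm(B, 2) ** 2
        for _ in range(inner):
            S = soft_threshold(S - B.T @ (B @ S - X) / Lc, lam / Lc)
        # Step 2: ridge-regularized least-squares dictionary update.
        B = X @ S.T @ np.linalg.inv(S @ S.T + 1e-6 * np.eye(n_atoms))
        # Renormalize atoms and rescale codes so the product B @ S is unchanged.
        norms = np.linalg.norm(B, axis=0) + 1e-12
        B /= norms
        S *= norms[:, None]
    return B, S

X = np.random.default_rng(3).standard_normal((15, 40))
B, S = alternate(X)
err = np.linalg.norm(X - B @ S) / np.linalg.norm(X)
print(err < 1.0)   # the objective decreases from the all-zero start, prints True
```

Each step decreases (or leaves unchanged) the overall objective, which is the usual argument for the convergence of such block-coordinate schemes; the full DGDTSC objective is handled the same way with its additional regularizers folded into Step 1.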

Experiments

In this section, we conduct extensive experiments of cross-database facial expression recognition on publicly available facial expression datasets to evaluate our methods.

Conclusion

In this paper, we have proposed a novel transfer sparse coding approach, called dual-graph regularized discriminative transfer sparse coding (DGDTSC), for robust cross-dataset facial expression recognition. An important advantage of our method is constructing a dual-graph to preserve the inter-domain and intra-domain geometrical information, which can effectively reduce the distribution divergence between different datasets. Moreover, we encode the discriminative information into sparse coding,

CRediT authorship contribution statement

Dongliang Chen: Methodology, Experiments, Writing – original draft. Peng Song: Supervision, Methodology, Writing – review & editing, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grant 61703360 and the Fundamental Research Funds for the Central Universities under Grant CDLS-2019-01.

Dongliang Chen received the B.S. degree in Computer Science from Yantai University, Yantai, China, in 2018. He is currently pursuing the M.S. degree in Computer Science at Yantai University. His current main research interests include affective computing and transfer learning.

References (67)

  • M. Dahmane et al., Emotion recognition using dynamic grid-based HOG features
  • L. Zhang et al., Facial expression recognition using facial movement features, IEEE Trans. Affect. Comput. (2011)
  • S. Happy et al., Automatic facial expression recognition using features of salient facial patches, IEEE Trans. Affect. Comput. (2015)
  • G. Zhao et al., Dynamic texture recognition using local binary patterns with an application to facial expressions, IEEE Trans. Pattern Anal. Mach. Intell. (2007)
  • S. Li et al., Deep facial expression recognition: a survey
  • H. Khalifa et al., Facial expression recognition using SVM classification on mic-macro patterns
  • P.S. Aleksic et al., Automatic facial expression recognition using facial animation parameters and multistream HMMs, IEEE Trans. Inf. Forensics Secur. (2006)
  • A. Majumder et al., Automatic facial expression recognition system using deep network-based data fusion, IEEE Trans. Cybern. (2018)
  • C.-M. Kuo et al., A compact deep learning model for robust facial expression recognition
  • W. John et al., Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. (2009)
  • S. Gao et al., Local features are not lonely – Laplacian sparse coding for image classification
  • J. Mairal et al., Non-local sparse models for image restoration
  • U. Tariq et al., Multi-view facial expression recognition analysis with generic sparse coding feature
  • M. Jampour et al., Multi-view facial expressions recognition using local linear regression of sparse codes
  • D.A. Chanti et al., Spontaneous facial expression recognition using sparse representation
  • P. Song, Transfer linear subspace learning for cross-corpus speech emotion recognition, IEEE Trans. Affect. Comput. (2019)
  • S.J. Pan et al., A survey on transfer learning, IEEE Trans. Knowl. Data Eng. (2010)
  • L. Shao et al., Transfer learning for visual categorization: a survey, IEEE Trans. Neural Netw. Learn. Syst. (2015)
  • W.-S. Chu et al., Selective transfer machine for personalized facial expression analysis, IEEE Trans. Pattern Anal. Mach. Intell. (2017)
  • W. Zheng et al., Cross-domain color facial expression recognition using transductive transfer subspace learning, IEEE Trans. Affect. Comput. (2018)
  • M. Long et al., Transfer sparse coding for robust image representation
  • A. Gretton et al., A kernel method for the two-sample-problem
  • S.J. Pan et al., Domain adaptation via transfer component analysis, IEEE Trans. Neural Netw. (2011)
Peng Song is currently an associate professor with the School of Computer and Control Engineering, Yantai University, China. He received the B.S. degree in EE from Shandong University of Science and Technology, China, in 2006, and the M.E. and Ph.D. degrees in EE from Southeast University, China, in 2009 and 2014, respectively. His current main research interests include affective computing and pattern recognition.
