Unsupervised sign language validation process based on hand-motion parameter clustering
Introduction
Language is a universal communication system that uses a set of symbols to transmit cognitive information. These symbols can be expressed by letters and numbers or by expressive gestures. Indeed, linguists consider that gestures in voice communication mainly compensate for the expressive limitations of spoken language. When it comes to hearing impairment, however, gestural communication is considered the most suitable means of communication.
In fact, sign language is the basic gestural communication system of the community of people with hearing impairments. Two degrees of hearing impairment can be distinguished, namely general and profound. People with a general hearing impairment can wear a hearing aid to compensate for the hearing loss; some of them are able to read lips and follow a conversation, while others support their discussions with sign language gestures. People with profound deafness, however, have neither access to audio information nor the ability to read lips, which explains their exclusive use of sign language.
In recent years, studies have emerged establishing that sign languages are full human languages. This recognition was a real turning point in the daily life of people with hearing impairments. According to the World Federation of the Deaf, there are more than 70 million hard-of-hearing people in the world. These people use more than 35 different sign languages, including 20 officially recognized by different states. This diversity is explained by the variation in vocabulary from one country to another and even from one region to another. Likewise, the cultural diversity and the different way of life of each community have a considerable impact on the nature and composition of gestures in sign language.
On the other hand, numerous sign language research studies have shown that the learning process of the hearing impaired differs from the classical learning process. Marschark and Harris (1996) stated that the learning process of people with a severe hearing impairment is extremely slow compared to that of hearing people: the experience acquired by hearing-impaired children in four years is equivalent to what hearing children acquire in one year. This is explained by difficulties in acquiring grammatical and conjugation rules (verb tenses, gender and number agreement, etc.) as well as in forming mental images of abstract concepts (e.g., in mathematics). All of these problems make communication between hearing and hearing-impaired people very complicated.
During the last five years, many deep learning-based sign language recognition studies have been conducted to improve the recognition process and therefore to facilitate communication between hard-of-hearing and hearing people. Unfortunately, to reach an acceptable recognition accuracy, deep learning methods such as convolutional neural networks require large training datasets. Consequently, due to the lack of large multi-sign-language 2D and 3D datasets, deep learning-based methods cannot offer the optimal solution. Similarly, apart from some attempts to create collaborative sign language creation tools, such as the works of Jemni and Elghoul (2008), Li et al. (2020), Dreuw and Ney (2008) and Athitsos et al. (2008), there is a lack of publicly available tools that allow users to create their own signs and therefore to build large training datasets.
In fact, the sign creation process is based on the collaboration of users to create sign language dictionaries. With this method, it is very common to obtain several replicas of the same sign, which poses the problem of sign validation. This problem can be resolved either by resorting to a sign language expert (supervised approach) or by adopting an automatic validation method (unsupervised approach). The choice depends on the effectiveness of the method as well as the time taken by the validation process. Admittedly, the supervised approach, which depends on an expert, remains the best alternative. However, given the large number of signs to be processed as well as the limited availability of experts, the unsupervised validation approach is the most suitable choice.
Our validation process is based on extracting a sign signature from redundant signs, relying on hand motion. The main idea behind this work is to automatically select the most common sign motion parameters from a list of redundant signs in order to identify a global signature characterizing the hand motion of each distinct sign. This research is dedicated to end-to-end sign language translation systems based on a virtual 3D signer, such as the WebSign translation system (Jemni and Elghoul, 2008). Additionally, this work is an extension of the sign validation process presented briefly in Boulares and Jemni (2019). The remainder of this paper is organized as follows. In Section 2, related work is reviewed. In Section 3, we present our sign language validation process. In Section 4, the experimental results and evaluation are provided. Finally, we conclude our work in Section 5.
Related work
The last few years have seen the emergence of automatic sign language data analysis. Several works have been carried out with the aim of analyzing 2D (video) or 3D (virtual animation) data. Among these works, we can cite the work of Deshpande and Kalbhor (2020), which is based on a Convolutional Neural Network (CNN) to recognize Marathi sign language alphabets. In the research conducted by Rastgoo et al. (2020), the authors relied on a deep learning-based pipeline architecture exploiting 2D
Our contribution
The automatic sign language validation process relies on sign feature clustering. As shown in Fig. 1, given a set of sign features issued from the hand motion trajectory estimation (Boulares and Jemni, 2019) of several sign replicas, our goal is to select the most common sign samples sharing the same features. This automatic selection is done by clustering the sign language features into two different groups. We assumed the hypothesis that the cluster having the
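The two-group clustering and majority-cluster selection described above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the k-means variant, the farthest-pair initialisation, and the layout of one motion-parameter vector per replica are all assumptions made for the example.

```python
import numpy as np

def select_valid_replicas(features, n_iter=50):
    """Partition replica feature vectors into two clusters (k-means, k=2)
    and return the larger cluster, assumed to hold the homogeneous replicas."""
    X = np.asarray(features, dtype=float)
    # Initialise the two centroids with the two most distant replicas,
    # so the run is deterministic for this sketch.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    i, j = np.unravel_index(D.argmax(), D.shape)
    centroids = np.stack([X[i], X[j]])
    for _ in range(n_iter):
        # Assign each replica to its nearest centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update centroids; keep the old one if a cluster empties.
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    majority = np.bincount(labels, minlength=2).argmax()
    return X[labels == majority]

def sign_signature(valid_replicas):
    """Global signature = mean motion-parameter vector of the retained replicas."""
    return np.asarray(valid_replicas, dtype=float).mean(axis=0)
```

With eight consistent replicas and two outlying ones, the majority cluster retains the consistent replicas and the signature is their mean vector.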
Experiments and discussion
Our automatic sign validation algorithm relies essentially on sign selection based on the homogeneous motion parameters issued from the clustering process. In other words, given a set of replicas of the same sign, our main goal is to automatically choose the appropriate sign motion parameters from a set of homogeneous sign replicas, with the aim of generating a sign signature that is then compared to the expert sign signature. In this section, we conduct a comparative study in order to explain the impact
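The comparison of the generated signature against the expert signature could be sketched as a relative-distance check; the relative Euclidean error and the `tol` acceptance threshold are hypothetical choices for illustration, not the metric reported in the paper.

```python
import numpy as np

def signature_agreement(candidate, expert, tol=0.15):
    """Compare an automatically generated sign signature to the expert's
    reference. Returns the relative Euclidean error and whether it falls
    under the (illustrative) acceptance threshold `tol`."""
    candidate = np.asarray(candidate, dtype=float)
    expert = np.asarray(expert, dtype=float)
    # Relative error, guarded against a zero-norm expert signature.
    rel_err = np.linalg.norm(candidate - expert) / max(np.linalg.norm(expert), 1e-12)
    return rel_err, rel_err < tol
```

A signature close to the expert's passes the check, while a signature built from wrong replicas yields a large relative error and is rejected.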
Conclusion and perspectives
Automatic sign language validation is a crucial step in creating homogeneous, efficient and unambiguous sign language dictionaries. In this work, we presented a novel sign language validation process based on the analysis of sign motion parameters. Our sign selection strategy relies on detecting and isolating wrong sign language replicas based on the homogeneity constraint issued from motion parameter clustering. We studied the impact of our automatic sign validation process based
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (35)
- et al. Hand sign language recognition using multi-view hand skeleton. Expert Syst. Appl. (2020)
- et al. Feature extraction in Brazilian Sign Language Recognition based on phonological structure and using RGB-D sensors. Expert Syst. Appl. (2014)
- et al. The American sign language lexicon video dataset
- Pattern Recognition and Machine Learning (2006)
- et al. Automatic hand motion analysis for the sign language space management. Pattern Anal. Appl. (2019)
- et al. A deep neural framework for continuous sign language recognition by iterative training. IEEE Trans. Multimed. (2019)
- et al. Video-based Marathi sign language recognition and text conversion using convolutional neural network
- et al. Kernel k-means: spectral clustering and normalized cuts
- Dreuw, P., Ney, H., 2008. Towards automatic sign language annotation for the ELAN tool. In: Workshop Programme, Vol....
- et al. Dynamic sign language recognition based on convolutional neural networks and texture maps
- Algorithm AS 136: A k-means clustering algorithm. Appl. Stat.
- Multiple proposals for continuous Arabic sign language recognition. Sens. Imaging
- The elements of statistical learning: data mining, inference and prediction. Math. Intelligencer
- A system to make signs using collaborative approach
- Isolated Chinese sign language recognition using gray-level co-occurrence matrix and parameter-optimized medium Gaussian support vector machine
- Iterative kernel principal component analysis for image modeling. IEEE Trans. Pattern Anal. Mach. Intell.
- Investigation of 3-D relational geometric features for kernel-based 3-D sign language recognition