Abstract
The communication between a person from the impaired community with a person who does not understand sign language could be a tedious task. Sign language is the art of conveying messages using hand gestures. Recognition of dynamic hand gestures in American Sign Language (ASL) became a very important challenge that is still unresolved. In order to resolve the challenges of dynamic ASL recognition, a more advanced successor of the Convolutional Neural Networks (CNNs) called 3-D CNNs is employed, which can recognize the patterns in volumetric data like videos. The CNN is trained for classification of 100 words on Boston ASL (Lexicon Video Dataset) LVD dataset with more than 3300 English words signed by 6 different signers. 70% of the dataset is used for Training while the remaining 30% dataset is used for testing the model. The proposed work outperforms the existing state-of-art models in terms of precision (3.7%), recall (4.3%), and f-measure (3.9%). The computing time (0.19 seconds per frame) of the proposed work shows that the proposal may be used in real-time applications.
Similar content being viewed by others
References
Ameen S, Sunil V (2017) A convolutional neural network to classify American Sign Language fingerspelling from depth and colour images, Expert Systems
Athitsos V et al (2008) The american sign language lexicon video dataset, Computer Vision and Pattern Recognition Workshops, IEEE Computer Society Conference on
Cheng WT, Sun Y, Li GF, Jiang GZ, Liu HH (2019) Jointly network: A network based on CNN and RBM for gesture recognition. Neural Comput Appl 31(Suppl 1):309–323
Cui Y, Juyang W (2000) Appearance-based hand sign recognition from intensity image sequences. Comput Vision Image Understand 78.2:157–176
Gao W, Fang G, Zhao D, Chen Y (2004) Transition movement models for large vocabulary continuous sign language recognition. Autom Face Gesture Recognit 553–558
He Y, Li GF, Liao YJ, Sun Y, Kong JY, Jiang GZ, Jiang D, Liu HH (2019) Gesture recognition based on an improved local sparse representation classification algorithm. Clust Comput 22(Suppl 5):10935–10946
Isaacs J, Foo S (2004) Hand pose estimation for american sign language recognition, System Theory, 2004. In: Proceedings of the thirty-sixth southeastern symposium on. IEEE, pp 132–136
Hinton G, Osindero S, Teh Y (2005) A fast learning algorithm for deep belief nets. Neural Comput 18:1527–1554
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313:504–507
Kang B, Tripathi S, Nguyen TQ (2015) Real-time sign language fingerspelling recognition using convolutional neural networks from depth map. In: Pattern recognition (ACPR), 3rd IAPR asian conference on. IEEE
Kingma D, Ba J (2014) Adam: A method for stochastic optimization, arXiv:1412.6980
Koller O et al (2016) Deep sign: Hybrid CNN-HMM for continuous sign language recognition. Proc British Machine Vision Conf 1–6
Koller O et al (2019) Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos. IEEE Trans Pattern Anal Machine Intell
Kumar K, Shrimankar D (2017) F-DES: Fast and deep event summarization. IEEE Trans Multimed 20(2):323–334
Lecun Y, Bengio Y (1995) Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, vol 3361
Lecun Y, Bengio Y, Lhinton G (2015) Deep learning. Nature 521:436–444
Lecun Y, Boser B, Denker GE, Henderson D, Howard RE, Hubbard W, et al. (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1:541–551
Lecun Y, Boser B, Denker JS, Howard RE, Hubbard W, Jackel LD, Henderson D (1990) Handwritten digit recognition with a back-propagation network. Adv Neural Inform Process Syst 396–404
Lecun Y, Bottou L, Orr G, Müller K-R (1989) Efficient BackProp. In: Orr G, Müller K-R (eds) Neural networks: Tricks of the trade, vol 1524. Springer, Berlin, pp 9–50
Lecun Y, Galland CC, Hinton GE (1988) GEMINI: Gradient Estimation through matrix inversion after noise injection. InNIPS 141–148
Lecun Y, Jackel L, Boser B, Denker J, Graf H, Guyon I, et al. (1990) Handwritten Digit recognition: Applications of neural net chips and automatic learning, Neurocomputing. Springer, Berlin, pp 303–318
Li Y, Hailong H, Zhangqian Z, Gang Z (2020) SCANet: Sensor-based continuous authentication with two-stream convolutional neural networks. ACM Trans Sensor Netw (TOSN) 16(3):1–27
Li GF, Jiang D, Zhou YL, Jiang GZ, Kong JY, Manogaran G (2019) Human lesion detection method based on image information and brain signal. IEEE Access 7:11533–11542
Li GF, Tang H, Sun Y, Kong JY, Jiang GZ, Jiang D, Tao B, Xu S, Liu HH (2019) Hand gesture recognition based on convolution neural network. Clust Comput 22(Suppl 2):2719–2729
Liang Z-J, Liao S-B, Hu B-Z (2018) 3D convolutional neural networks for dynamic sign language recognition. Comput J 61.11:1724–1736
Liwicki S, Everingham M (2009) Automatic recognition of fingerspelled words in british sign language. In: Computer vision and pattern recognition workshops IEEE Computer Society Conference on, pp 50–57
Ma Y, Gang Z, Shuangquan W, Hongyang Z, Woosub J (2018) SignFi: Sign language recognition using WiFi. Proc ACM on Interact Mob Wearable Ubiquitous Technol 2(1):1–21
Ma J et al (2000) A continuous chinese sign language recognition system. Automat Face Gesture Recognit 428–433
Negin F et al (2018) PRAXIS: Towards automatic cognitive assessment using gesture recognition. Expert Syst Appl 106:21–35
Ong E-J et al (2004) A boosted classifier tree for hand shape detection. IEEE Autom Face Gesture Recognit 889–894
Pigou L et al (2014) Sign language recognition using convolutional neural networks, Workshop at the European Conference on Computer Vision. Springer, Cham
Sagawa H, et al. (2000) A method for recognizing a sequence of sign language words represented in a Japanese Sign Language sentence. Autom Face Gesture Recognit 434–439
Sharma S, Kumar K, Singh N (2020) Deep Eigen Space based ASL Recognition System, IETE Journal of Research
Srivastava N, et al. (2014) Dropout: A simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958
Uebersax D, et al. (2011) Real-time sign language letter and word recognition from depth data. Computer Vision Workshops IEEE International Conference on
Vogler C, Metaxas DN (2003) Handshapes and movements: Multiple-channel American Sign Language recognition. Gesture Workshop 247–258
Wang C, Shan S, Gao W (2002) An approach based on phonemes to large vocabulary Chinese Sign Language recognition. Autom Face Gesture Recognit 411–416
Yao G, Yao H, Liu X, Jiang F (2006) Real time large vocabulary continuous sign language recognition based on OP/viterbi algorithm. Int Conf Pattern Recognit 3:312–315
Yosinski J et al (2014) How transferable are features in deep neural networks?. In: Advances in neural information processing systems, pp 3320–3328
Zeiler MD et al (2013) On rectified linear units for speech processing. In: Proc. ICASSP
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Krishan Kumar is a Senior Member, IEEE
Rights and permissions
About this article
Cite this article
Sharma, S., Kumar, K. ASL-3DCNN: American sign language recognition technique using 3-D convolutional neural networks. Multimed Tools Appl 80, 26319–26331 (2021). https://doi.org/10.1007/s11042-021-10768-5
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-021-10768-5