ASL-3DCNN: American sign language recognition technique using 3-D convolutional neural networks

Multimedia Tools and Applications

Abstract

Communication between a person from the hearing-impaired community and a person who does not understand sign language can be tedious. Sign language is the art of conveying messages using hand gestures. Recognition of dynamic hand gestures in American Sign Language (ASL) has become an important challenge that remains unresolved. To address the challenges of dynamic ASL recognition, this work employs 3-D CNNs, a more advanced successor of Convolutional Neural Networks (CNNs) that can recognize patterns in volumetric data such as videos. The network is trained to classify 100 words from the Boston ASL Lexicon Video Dataset (LVD), which contains more than 3300 English words signed by 6 different signers. 70% of the dataset is used for training, and the remaining 30% is used for testing the model. The proposed work outperforms existing state-of-the-art models in terms of precision (3.7%), recall (4.3%), and F-measure (3.9%). The computing time of 0.19 seconds per frame suggests that the approach may be usable in real-time applications.
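The abstract does not describe the network architecture, so the following is only a minimal sketch of the basic operation a 3-D CNN layer relies on: sliding a kernel along time as well as height and width of a video volume. It is not the authors' model. The names `conv3d_valid`, `clip`, and `kernel` are hypothetical, the toy clip is synthetic, and no padding, stride, bias, or activation is included.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3-D kernel over a (frames, height, width) volume.

    Uses stride 1 and no padding. As in most deep learning libraries,
    the kernel is not flipped, so this is technically cross-correlation.
    """
    frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])

    output = []
    for t in range(frames - kt + 1):          # position along time
        plane = []
        for y in range(height - kh + 1):      # position along height
            row = []
            for x in range(width - kw + 1):   # position along width
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        output.append(plane)
    return output


# Hypothetical toy clip: 4 frames of 3x3 pixels, frame t filled with the value t.
clip = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]

# A 2x2x2 kernel that subtracts one frame from the next (a temporal difference).
kernel = [
    [[-1.0, -1.0], [-1.0, -1.0]],  # earlier frame
    [[1.0, 1.0], [1.0, 1.0]],      # later frame
]

response = conv3d_valid(clip, kernel)
print(len(response), len(response[0]), len(response[0][0]))  # output shape
```

Because the kernel spans two consecutive frames, its response depends on how pixel values change over time, which is the property that lets 3-D convolutions capture motion in dynamic gestures, something a 2-D kernel applied to a single frame cannot see.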

Notes

  1. http://www.bu.edu/av/asllrp/dai-asllvd.html

Author information

Corresponding author

Correspondence to Krishan Kumar.

Additional information

Krishan Kumar is a Senior Member, IEEE

About this article

Cite this article

Sharma, S., Kumar, K. ASL-3DCNN: American sign language recognition technique using 3-D convolutional neural networks. Multimed Tools Appl 80, 26319–26331 (2021). https://doi.org/10.1007/s11042-021-10768-5

  • DOI: https://doi.org/10.1007/s11042-021-10768-5
