ASL-3DCNN: American sign language recognition technique using 3-D convolutional neural networks
Multimedia Tools and Applications ( IF 3.0 ) Pub Date : 2021-05-01 , DOI: 10.1007/s11042-021-10768-5
Shikhar Sharma , Krishan Kumar

Communication between a person from the hearing-impaired community and a person who does not understand sign language can be a tedious task. Sign language is the art of conveying messages through hand gestures. Recognizing dynamic hand gestures in American Sign Language (ASL) remains an important and unresolved challenge. To address dynamic ASL recognition, a more advanced successor of the Convolutional Neural Network (CNN), the 3-D CNN, is employed; it can recognize patterns in volumetric data such as videos. The CNN is trained to classify 100 words on the Boston ASL Lexicon Video Dataset (LVD), which contains more than 3,300 English words signed by 6 different signers. 70% of the dataset is used for training, and the remaining 30% for testing the model. The proposed work outperforms existing state-of-the-art models in precision (by 3.7%), recall (by 4.3%), and F-measure (by 3.9%). Its computing time (0.19 seconds per frame) suggests that the proposal may be usable in real-time applications.
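The key idea behind the 3-D CNN is that its kernels convolve over time as well as space, so a single filter can respond to motion patterns across consecutive video frames. The following is a minimal NumPy sketch of one such 3-D convolution over a toy video volume; it is an illustration of the operation only, not the authors' network architecture, and the volume and kernel sizes are arbitrary assumptions.

```python
import numpy as np

def conv3d(volume, kernel):
    """Valid-mode 3-D convolution (cross-correlation) of a video volume
    shaped (frames, height, width) with a spatiotemporal kernel."""
    T, H, W = volume.shape
    t, h, w = kernel.shape
    out = np.zeros((T - t + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):          # slide over time (frames)
        for j in range(out.shape[1]):      # slide over height
            for k in range(out.shape[2]):  # slide over width
                out[i, j, k] = np.sum(volume[i:i+t, j:j+h, k:k+w] * kernel)
    return out

# toy example: 8 frames of 16x16 grayscale video (random stand-in data)
video = np.random.rand(8, 16, 16)
# a 3x3x3 averaging kernel spanning 3 frames x 3x3 pixels
kernel = np.ones((3, 3, 3)) / 27.0
features = conv3d(video, kernel)
print(features.shape)  # (6, 14, 14): temporal and spatial extents both shrink
```

In a real 3-D CNN (e.g. `torch.nn.Conv3d`), many such filters are learned, stacked, and interleaved with pooling, but the spatiotemporal sliding window shown here is the core operation that distinguishes it from a 2-D CNN applied frame by frame.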




Updated: 2021-05-02