Development of a hybrid word recognition system and dataset for the Azerbaijani Sign Language dactyl alphabet
Speech Communication (IF 3.2), Pub Date: 2023-07-21, DOI: 10.1016/j.specom.2023.102960
Jamaladdin Hasanov, Nigar Alishzade, Aykhan Nazimzade, Samir Dadashzade, Toghrul Tahirov

The paper introduces a real-time fingerspelling-to-text translation system for Azerbaijani Sign Language (AzSL), intended to clarify words that have no sign or only an ambiguous one. The system combines statistical and probabilistic models, which are used in the sign recognition and sequence generation phases. The study addresses linguistic, technical, and human–computer interaction challenges that publicly available sign recognition application programming interfaces and tools usually do not consider. The specifics of AzSL are reviewed, feature selection strategies are evaluated, and a robust model for translating hand signs is proposed. The two-stage recognition model exhibits high accuracy during real-time inference. Because no publicly available benchmark dataset existed, a new, comprehensive AzSL dataset of 13,444 samples collected by 221 volunteers is described and made publicly available to the sign language recognition community. To extend the dataset and make the model robust to variation, augmentation methods and their effect on performance are analyzed. A lexicon-based validation method, used for probabilistic analysis and candidate word selection, increases the probability of the recognized phrases. Experiments delivered 94% accuracy on the test dataset, which was close to the accuracy experienced by users in real time. The dataset and the implemented software are shared in a public repository for review and further research (CeDAR, 2021; Alishzade et al., 2022). The work was presented at TeknoFest 2022 and ranked first in the category of social-oriented technologies.
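
The lexicon-based validation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so everything here is an assumption made for illustration: the function names (`greedy_decode`, `score_word`, `best_candidate`), the per-position letter probabilities, the probability floor, and the three-word lexicon are all hypothetical and are not taken from the authors' software. The sketch assumes the recognizer emits one letter distribution per fingerspelled position and picks the lexicon word with the highest summed log-probability.

```python
import math

# Hypothetical recognizer output: one letter distribution per fingerspelled position.
letter_probs = [
    {"s": 0.4, "z": 0.6},
    {"a": 0.9, "o": 0.1},
    {"l": 0.8, "i": 0.2},
    {"a": 0.9, "e": 0.1},
    {"m": 0.7, "n": 0.3},
]

# Hypothetical lexicon of valid words (illustrative only).
lexicon = ["salam", "kitab", "zaman"]


def greedy_decode(dists):
    """Pick the most probable letter at each position, ignoring the lexicon."""
    return "".join(max(d, key=d.get) for d in dists)


def score_word(word, dists, floor=1e-6):
    """Sum of log-probabilities of the word's letters; -inf on length mismatch.

    Letters the recognizer never proposed get a small floor probability
    (an assumed smoothing choice) so the logarithm stays finite.
    """
    if len(word) != len(dists):
        return float("-inf")
    return sum(math.log(max(d.get(ch, 0.0), floor)) for ch, d in zip(word, dists))


def best_candidate(dists, words):
    """Return the lexicon word with the highest score, or None if none fits."""
    scored = [(score_word(w, dists), w) for w in words]
    scored = [(s, w) for s, w in scored if s != float("-inf")]
    return max(scored)[1] if scored else None


raw = greedy_decode(letter_probs)
validated = best_candidate(letter_probs, lexicon)
print(raw, validated)
```

In this toy input the letter-by-letter decoding is not a lexicon word, while the lexicon constraint recovers a valid word whose first letter was only the second most likely sign. A real system would also need to handle insertions and deletions, which this equal-length comparison deliberately ignores.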




Updated: 2023-07-21