Deep Sign: Enabling Robust Statistical Continuous Sign Language Recognition via Hybrid CNN-HMMs
International Journal of Computer Vision (IF 19.5), Pub Date: 2018-10-05, DOI: 10.1007/s11263-018-1121-3
Oscar Koller, Sepehr Zargaran, Hermann Ney, Richard Bowden

This manuscript introduces the end-to-end embedding of a CNN into an HMM, interpreting the outputs of the CNN in a Bayesian framework. The hybrid CNN-HMM combines the strong discriminative abilities of CNNs with the sequence modelling capabilities of HMMs. Most current approaches in the field of gesture and sign language recognition disregard the necessity of dealing with sequence data both for training and evaluation. With the presented end-to-end embedding, we improve over the state of the art on three challenging benchmark continuous sign language recognition tasks, reducing the word error rate by between 15% and 38% relative and by up to 20% absolute. We analyse the effect of the CNN structure, network pretraining and number of hidden states. We compare the hybrid modelling to a tandem approach and evaluate the gain of model combination.
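
To illustrate what "interpreting the outputs of the CNN in a Bayesian framework" can mean in a hybrid model, here is a minimal sketch of the conversion commonly used in hybrid neural network/HMM systems: the per-frame state posteriors p(s|x) produced by a softmax layer are divided by the state priors p(s) to obtain scaled likelihoods, which an HMM decoder can then use as emission scores. The abstract does not give the authors' exact formulation, so the function name, the `prior_scale` parameter, and the toy numbers below are assumptions made purely for illustration.

```python
import numpy as np


def scaled_log_likelihoods(posteriors, priors, prior_scale=1.0, eps=1e-10):
    """Turn per-frame state posteriors p(s|x) into scaled log-likelihoods.

    By Bayes' rule, p(x|s) = p(s|x) * p(x) / p(s). The term p(x) is the same
    for every state at a given frame, so it can be dropped when comparing
    states. `prior_scale` (hypothetical name) weights the prior term.
    """
    posteriors = np.asarray(posteriors, dtype=float)  # shape: (frames, states)
    priors = np.asarray(priors, dtype=float)          # shape: (states,)
    return np.log(posteriors + eps) - prior_scale * np.log(priors + eps)


# Toy example (made-up numbers): 2 frames, 3 HMM states.
posteriors = np.array([[0.7, 0.2, 0.1],
                       [0.3, 0.4, 0.3]])
priors = np.array([0.6, 0.3, 0.1])  # e.g. relative state frequencies in an alignment

scores = scaled_log_likelihoods(posteriors, priors)
print(scores.argmax(axis=1))
```

In the second toy frame the posterior favours state 1, but after dividing by the priors the rare state 2 receives the highest emission score. This shows how the prior correction can change which state the HMM prefers for a frame.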

Updated: 2018-10-05