An Improved Sign Language Translation Model with Explainable Adaptations for Processing Long Sign Sentences
Computational Intelligence and Neuroscience (IF 3.120) Pub Date: 2020-10-24, DOI: 10.1155/2020/8816125
Jiangbin Zheng 1, Zheng Zhao 2, Min Chen 2, Jing Chen 2, Chong Wu 3, Yidong Chen 1, Xiaodong Shi 1, Yiqi Tong 1
Sign language translation (SLT) is an important application for bridging the communication gap between deaf and hearing people. In recent years, research on SLT based on neural translation frameworks has attracted wide attention. Despite this progress, current SLT research is still at an early stage. In particular, current systems perform poorly on long sign sentences, which often involve long-distance dependencies and require large resource consumption. To tackle this problem, we propose two explainable adaptations to traditional neural SLT models using optimized tokenization-related modules. First, we introduce a frame stream density compression (FSDC) algorithm that detects and removes redundant, highly similar frames, effectively shortening long sign sentences without losing information. Second, we replace the traditional encoder in the neural machine translation (NMT) module with an improved architecture that sequentially combines a temporal convolution (T-Conv) unit and a dynamic hierarchical bidirectional GRU (DH-BiGRU) unit. The improved component takes temporal tokenization information into account to extract deeper information at reasonable resource cost. Our experiments on the RWTH-PHOENIX-Weather 2014T dataset show that the proposed model outperforms the state-of-the-art baseline by up to about 1.5 BLEU-4 points.
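The core idea behind FSDC — dropping consecutive frames that are nearly identical so that long sign sequences shrink before they reach the encoder — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the cosine-similarity measure, the threshold value, and the function name `fsdc_compress` are all assumptions made for the example.

```python
import numpy as np

def fsdc_compress(frames, threshold=0.98):
    """Keep a frame only if it differs enough from the last kept frame.

    frames: array of shape (T, d), one feature vector per video frame.
    threshold: cosine similarity above which a frame is treated as
        redundant (an assumed criterion, not the paper's exact one).
    Returns the compressed frame stream of shape (T', d), T' <= T.
    """
    kept = [frames[0]]
    for f in frames[1:]:
        prev = kept[-1]
        sim = float(np.dot(prev, f) /
                    (np.linalg.norm(prev) * np.linalg.norm(f) + 1e-8))
        if sim < threshold:  # frame adds new information: keep it
            kept.append(f)
    return np.stack(kept)
```

Comparing each frame only against the last *kept* frame (rather than its immediate predecessor) prevents a slow drift of near-duplicates from surviving the filter.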
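The T-Conv unit in the improved encoder aggregates neighboring frames along the time axis, so the downstream DH-BiGRU sees a shorter, denser sequence. The sketch below shows only this temporal-convolution step in plain numpy; the kernel size, stride, and function name are illustrative assumptions, and the real model would follow it with the recurrent unit.

```python
import numpy as np

def t_conv(x, w, stride=2):
    """Strided temporal convolution over a frame-feature sequence.

    x: input sequence of shape (T, d_in).
    w: kernel of shape (k, d_in, d_out) mixing k consecutive frames.
    Returns an output of shape ((T - k) // stride + 1, d_out), i.e.
    a temporally downsampled sequence ready for a recurrent encoder.
    """
    k, d_in, d_out = w.shape
    T = x.shape[0]
    out_len = (T - k) // stride + 1
    out = np.empty((out_len, d_out))
    for i in range(out_len):
        window = x[i * stride : i * stride + k]        # (k, d_in)
        # contract the time and input-feature axes against the kernel
        out[i] = np.tensordot(window, w, axes=([0, 1], [0, 1]))
    return out
```

With stride 2 the sequence length is roughly halved at each T-Conv layer, which is one way such a unit keeps GRU resource consumption reasonable on long sign sentences.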

Updated: 2020-10-30