SDST: Successive Decoding for Speech-to-text Translation, arXiv - CS - Computation and Language

arXiv - CS - Computation and Language. Pub Date: 2020-09-21, DOI: arxiv-2009.09737
Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li

End-to-end speech-to-text translation (ST), which translates source-language speech directly into target-language text, has attracted intensive attention recently. However, combining speech recognition and machine translation in a single model places a heavy burden on the direct cross-modal, cross-lingual mapping. To reduce the learning difficulty, we propose SDST, an integral framework with Successive Decoding for the end-to-end Speech-to-text Translation task. The method is verified on two mainstream datasets. Experiments show that the proposed SDST improves on the previous state-of-the-art methods by large margins.

Updated: 2020-09-22
