FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task
arXiv - CS - Sound. Pub Date: 2021-07-14. DOI: arxiv-2107.06959. Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal
In this paper, we describe our end-to-end multilingual speech translation
system submitted to the IWSLT 2021 evaluation campaign on the Multilingual
Speech Translation shared task. Our system is built by leveraging transfer
learning across modalities, tasks and languages. First, we leverage
general-purpose multilingual modules pretrained with large amounts of
unlabelled and labelled data. We further enable knowledge transfer from the
text task to the speech task by training two tasks jointly. Finally, our
multilingual model is finetuned on speech translation task-specific data to
achieve the best translation results. Experimental results show our system
outperforms the reported systems, including both end-to-end and cascade-based
approaches, by a large margin. In some translation directions, our speech translation results evaluated on
the public Multilingual TEDx test set are even comparable with the ones from a
strong text-to-text translation system, which uses the oracle speech
transcripts as input.
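The joint text-to-speech knowledge transfer described above can be sketched as a simple multi-task objective, where a shared model is trained on both machine translation (MT) and speech translation (ST) batches and their losses are mixed. This is a hypothetical illustration only; the abstract does not specify the actual loss weighting or function names used by the FST system.

```python
def joint_loss(st_loss, mt_loss, mt_weight=1.0):
    """Combine the speech-translation loss with an auxiliary
    text-translation loss (hypothetical sketch of joint training).

    mt_weight controls how much the text task contributes; the exact
    balance used by the submitted system is not given in the abstract.
    """
    return st_loss + mt_weight * mt_loss


# Example: an ST batch loss of 2.0 and an MT batch loss of 1.0,
# with the auxiliary task down-weighted to 0.5.
combined = joint_loss(2.0, 1.0, mt_weight=0.5)  # 2.5
```

After this joint training stage, the abstract states the multilingual model is further finetuned on ST-specific data alone, which in this sketch would correspond to training with only the `st_loss` term.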
中文翻译:
FST:IWSLT21 多语言共享任务的 FAIR 语音翻译系统
在本文中,我们描述了我们提交给 IWSLT 2021 多语言语音翻译共享任务评估活动的端到端多语言语音翻译系统。我们的系统是通过利用跨模态、跨任务和跨语言的迁移学习来构建的。首先,我们利用经过大量未标注和已标注数据预训练的通用多语言模块。我们通过联合训练两个任务,进一步实现了从文本任务到语音任务的知识迁移。最后,我们的多语言模型在语音翻译任务特定数据上进行了微调,以达到最佳翻译效果。实验结果表明,我们的系统大幅优于已报告的系统,包括端到端和基于级联的方法。在一些翻译方向上,我们在公开的 Multilingual TEDx 测试集上评估的语音翻译结果,甚至可以与以 oracle 语音转录为输入的强大文本到文本翻译系统的结果相媲美。
更新日期:2021-07-16