FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task
arXiv - CS - Sound. Pub Date: 2021-07-14. DOI: arxiv-2107.06959. Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal
In this paper, we describe our end-to-end multilingual speech translation
system submitted to the IWSLT 2021 evaluation campaign on the Multilingual
Speech Translation shared task. Our system is built by leveraging transfer
learning across modalities, tasks and languages. First, we leverage
general-purpose multilingual modules pretrained with large amounts of
unlabelled and labelled data. We further enable knowledge transfer from the
text task to the speech task by training two tasks jointly. Finally, our
multilingual model is finetuned on speech translation task-specific data to
achieve the best translation results. Experimental results show our system
outperforms the reported systems, including both end-to-end and cascade-based
approaches, by a large margin. In some translation directions, our speech translation results evaluated on
the public Multilingual TEDx test set are even comparable with the ones from a
strong text-to-text translation system, which uses the oracle speech
transcripts as input.
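The joint text-to-speech knowledge transfer described above can be sketched as a simple multi-task objective, where a shared model is trained on both machine translation (MT) and speech translation (ST) batches and their losses are mixed. This is a hypothetical illustration only; the abstract does not specify the actual loss weighting or function names used by the FST system.

```python
def joint_loss(st_loss, mt_loss, mt_weight=1.0):
    """Combine the speech-translation loss with an auxiliary
    text-translation loss (hypothetical sketch of joint training).

    mt_weight controls how much the text task contributes; the exact
    balance used by the submitted system is not given in the abstract.
    """
    return st_loss + mt_weight * mt_loss


# Example: an ST batch loss of 2.0 and an MT batch loss of 1.0,
# with the auxiliary task down-weighted to 0.5.
combined = joint_loss(2.0, 1.0, mt_weight=0.5)  # 2.5
```

After this joint training stage, the abstract states the multilingual model is further finetuned on ST-specific data alone, which in this sketch would correspond to training with only the `st_loss` term.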
中文翻译:
FST:IWSLT21 多语言共享任务的 FAIR 语音翻译系统
在本文中,我们描述了我们提交给 IWSLT 2021 多语言语音翻译共享任务评估活动的端到端多语言语音翻译系统。我们的系统是通过利用跨模态、跨任务和跨语言的迁移学习来构建的。首先,我们利用经过大量未标注和已标注数据预训练的通用多语言模块。我们通过联合训练两个任务,进一步实现了从文本任务到语音任务的知识迁移。最后,我们的多语言模型在语音翻译任务特定数据上进行了微调,以达到最佳翻译效果。实验结果表明,我们的系统大幅优于已报告的系统,包括端到端和基于级联的方法。在一些翻译方向上,我们在公开的 Multilingual TEDx 测试集上评估的语音翻译结果,甚至可以与以 oracle 语音转录为输入的强大文本到文本翻译系统的结果相媲美。
更新日期:2021-07-16