The NiuTrans System for the WMT21 Efficiency Task
arXiv - CS - Computation and Language. Pub Date: 2021-09-16. DOI: arXiv:2109.08003.
Authors: Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Minyi Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu
This paper describes the NiuTrans system for the WMT21 translation efficiency
task (http://statmt.org/wmt21/efficiency-task.html). Following last year's
work, we explore various techniques to improve efficiency while maintaining
translation quality. We investigate the combinations of lightweight Transformer
architectures and knowledge distillation strategies. We also improve
translation efficiency through graph optimization, low-precision inference, dynamic
batching, and parallel pre/post-processing. Our system translates 247,000
words per second on an NVIDIA A100, 3$\times$ faster than last year's
system. Our system is the fastest and has the lowest memory consumption on the
GPU-throughput track. The code, model, and pipeline will be available at
NiuTrans.NMT (https://github.com/NiuTrans/NiuTrans.NMT).
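
Of the techniques listed above, dynamic batching is easy to illustrate in isolation. The sketch below is a minimal, hypothetical illustration of the general idea (it is not the NiuTrans.NMT implementation, and all names are made up): sentences are sorted by length and packed into batches under a fixed token budget, so each padded batch wastes little compute on padding positions.

```python
def dynamic_batches(sentences, max_tokens=4096):
    """Group token-id sequences into batches under a token budget.

    Sorting by length first keeps similar-length sequences together,
    so after padding to the longest sequence in a batch, few positions
    are wasted. The cost of a padded batch is approximated as
    (longest sequence length) * (number of sequences).
    """
    batches = []
    batch, max_len = [], 0
    for sent in sorted(sentences, key=len):
        new_max = max(max_len, len(sent))
        # Adding this sentence would exceed the budget: flush the batch.
        if batch and new_max * (len(batch) + 1) > max_tokens:
            batches.append(batch)
            batch, max_len = [], 0
            new_max = len(sent)
        batch.append(sent)
        max_len = new_max
    if batch:
        batches.append(batch)
    return batches
```

In practice a decoder would translate each batch and then restore the original sentence order before post-processing, since sorting by length permutes the input.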
Updated: 2021-09-17