The NiuTrans System for the WMT21 Efficiency Task
arXiv - CS - Computation and Language. Pub Date: 2021-09-16, DOI: arXiv:2109.08003
Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Minyi Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu

This paper describes the NiuTrans system for the WMT21 translation efficiency task (http://statmt.org/wmt21/efficiency-task.html). Following last year's work, we explore various techniques to improve efficiency while maintaining translation quality. We investigate combinations of lightweight Transformer architectures and knowledge distillation strategies. We also improve translation efficiency with graph optimization, low-precision inference, dynamic batching, and parallel pre/post-processing. Our system translates 247,000 words per second on an NVIDIA A100, 3$\times$ faster than last year's system, and is the fastest and lowest in memory consumption on the GPU-throughput track. The code, model, and pipeline will be available at NiuTrans.NMT (https://github.com/NiuTrans/NiuTrans.NMT).
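Among the techniques the abstract lists, dynamic batching is easy to illustrate in isolation. The sketch below is a generic token-budget batcher, not the NiuTrans implementation: sentences are sorted by length so each batch pads to a similar width, and a batch is closed once its padded size would exceed a token budget. All names (`dynamic_batches`, `max_tokens`) are hypothetical.

```python
def dynamic_batches(sentences, max_tokens=8):
    """Group tokenized sentences into batches whose padded size
    (num_sentences * longest_sentence) stays within max_tokens.

    Returns a list of batches, each a list of indices into `sentences`.
    """
    # Sort by length so neighboring sentences pad to similar widths,
    # wasting fewer computations on padding tokens.
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
    batches, current = [], []
    for idx in order:
        longest = max(len(sentences[i]) for i in current + [idx])
        # Close the batch if adding this sentence would blow the budget.
        if current and longest * (len(current) + 1) > max_tokens:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches
```

In practice, NMT toolkits apply the same idea at inference time so that throughput is bounded by a token budget rather than a fixed sentence count, which keeps GPU utilization stable across mixed-length inputs.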

Updated: 2021-09-17