Synchronous bidirectional inference for neural sequence generation
Artificial Intelligence (IF 14.4), Pub Date: 2020-04-01, DOI: 10.1016/j.artint.2020.103234
Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong

In sequence-to-sequence generation tasks (e.g. machine translation and abstractive summarization), inference is generally performed left to right, producing the result token by token. Neural approaches such as LSTM and self-attention networks can make full use of all previously predicted hypotheses on the left side during inference, but they cannot access any future (right-side) information at the same time, and they usually generate unbalanced outputs in which the left parts are much more accurate than the right ones. In this work, we propose a synchronous bidirectional inference model that generates outputs using left-to-right and right-to-left decoding simultaneously and interactively. First, we introduce a novel beam search algorithm that facilitates synchronous bidirectional decoding. Then, we present the core approach, which enables left-to-right and right-to-left decoding to interact with each other so that both history and future predictions are used during inference. We apply the proposed model to both LSTM and self-attention networks. In addition, we propose two strategies for parameter optimization. Extensive experiments on machine translation and abstractive summarization demonstrate that our synchronous bidirectional inference model achieves remarkable improvements over strong baselines.
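The abstract does not spell out the beam search itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the beam is split evenly between a left-to-right and a right-to-left direction, that both directions advance in the same step, and that each direction is shown the current best partial hypothesis of the other as its "future" context. The names `sync_bidirectional_beam_search`, `step_fn`, and `toy_step` are hypothetical, and the toy scorer ignores the future context that a real model would attend to.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "</s>"
TARGET = ["the", "cat", "sat"]  # toy reference sequence (assumption for the demo)


def toy_step(direction: str, history: List[str], future: List[str]) -> Dict[str, float]:
    """Hypothetical scorer: returns log-probabilities for the next token.

    A real model would condition on `future` (the other direction's
    predictions); this toy version ignores it and simply favours TARGET.
    """
    seq = TARGET if direction == "l2r" else TARGET[::-1]
    gold = seq[len(history)] if len(history) < len(seq) else EOS
    vocab = TARGET + [EOS]
    return {tok: math.log(0.7 if tok == gold else 0.1) for tok in vocab}


def sync_bidirectional_beam_search(
    step_fn: Callable[[str, List[str], List[str]], Dict[str, float]],
    beam_size: int = 4,
    max_len: int = 5,
) -> Tuple[List[str], float]:
    half = beam_size // 2  # assumption: half the beam per direction
    beams = {"l2r": [([], 0.0)], "r2l": [([], 0.0)]}
    finished = []  # (direction, tokens without EOS, cumulative log-prob)

    for _ in range(max_len):
        new_beams = {}
        for direction, other in (("l2r", "r2l"), ("r2l", "l2r")):
            # Both directions read the beams of the previous step, so they
            # advance synchronously rather than one after the other.
            future = beams[other][0][0] if beams[other] else []
            candidates = []
            for tokens, score in beams[direction]:
                for tok, logp in step_fn(direction, tokens, future).items():
                    candidates.append((tokens + [tok], score + logp))
            candidates.sort(key=lambda h: h[1], reverse=True)
            kept = []
            for tokens, score in candidates[:half]:
                if tokens[-1] == EOS:
                    finished.append((direction, tokens[:-1], score))
                else:
                    kept.append((tokens, score))
            new_beams[direction] = kept
        beams = new_beams
        if not beams["l2r"] and not beams["r2l"]:
            break

    # Hypotheses still open at max_len compete with the finished ones.
    for direction in ("l2r", "r2l"):
        for tokens, score in beams[direction]:
            finished.append((direction, tokens, score))

    direction, tokens, score = max(finished, key=lambda f: f[2])
    # Right-to-left hypotheses are stored reversed; flip them for output.
    return (tokens if direction == "l2r" else tokens[::-1]), score
```

Calling `sync_bidirectional_beam_search(toy_step)` returns the best hypothesis in reading order together with its log-probability. Scores are not length-normalized here, which a practical decoder would likely want.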

Updated: 2020-04-01