Learning Syntactic and Dynamic Selective Encoding for Document Summarization
arXiv - CS - Computation and Language. Pub Date: 2020-03-25, DOI: arxiv-2003.11173
Haiyang Xu, Yahao He, Kun Han, Junwen Chen and Xiangang Li

Text summarization aims to generate a headline or a short summary that conveys the major information of the source text. Recent studies employ the sequence-to-sequence framework to encode the input with a neural network and generate an abstractive summary. However, most studies feed the encoder with semantic word embeddings but ignore the syntactic information of the text. Further, although previous studies proposed a selective gate to control the information flow from the encoder to the decoder, this gate is static during decoding and cannot differentiate information based on the decoder states. In this paper, we propose a novel neural architecture for document summarization. Our approach makes the following contributions: first, we incorporate syntactic information, such as constituency parse trees, into the encoding sequence to learn both semantic and syntactic information from the document, resulting in more accurate summaries; second, we propose a dynamic gate network that selects salient information based on the context of the decoder state, which is essential for document summarization. The proposed model has been evaluated on the CNN/Daily Mail summarization dataset, and the experimental results show that the proposed approach outperforms baseline approaches.
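The abstract does not include code, but the two ideas it names can be illustrated with a minimal PyTorch-style sketch: (a) syntax-aware input encoding that concatenates word embeddings with embeddings of syntactic tags (e.g., constituency labels), and (b) a dynamic selective gate that filters encoder states conditioned on the current decoder state at each step. All class names, dimensions, and the exact gating formula below are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class SyntaxAwareEmbedding(nn.Module):
    """Hypothetical sketch: concatenate word embeddings with embeddings of
    syntactic tags (e.g., constituency labels) before the encoder."""

    def __init__(self, vocab_size: int, tag_size: int, word_dim: int, tag_dim: int):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.tag_emb = nn.Embedding(tag_size, tag_dim)

    def forward(self, word_ids: torch.Tensor, tag_ids: torch.Tensor) -> torch.Tensor:
        # word_ids, tag_ids: (batch, src_len)
        # output: (batch, src_len, word_dim + tag_dim)
        return torch.cat([self.word_emb(word_ids), self.tag_emb(tag_ids)], dim=-1)


class DynamicSelectiveGate(nn.Module):
    """Hypothetical sketch: unlike a static selective gate computed once after
    encoding, this gate is recomputed at every decoding step from the current
    decoder state, so the selection of salient encoder features can change as
    decoding proceeds."""

    def __init__(self, enc_dim: int, dec_dim: int):
        super().__init__()
        self.gate_proj = nn.Linear(enc_dim + dec_dim, enc_dim)

    def forward(self, enc_states: torch.Tensor, dec_state: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, src_len, enc_dim); dec_state: (batch, dec_dim)
        expanded = dec_state.unsqueeze(1).expand(-1, enc_states.size(1), -1)
        gate = torch.sigmoid(self.gate_proj(torch.cat([enc_states, expanded], dim=-1)))
        return gate * enc_states  # element-wise selection of salient encoder features


if __name__ == "__main__":
    # Toy shapes only, to show how the two pieces fit into an encoder-decoder loop.
    emb = SyntaxAwareEmbedding(vocab_size=1000, tag_size=50, word_dim=64, tag_dim=16)
    gate = DynamicSelectiveGate(enc_dim=80, dec_dim=80)
    words = torch.randint(0, 1000, (2, 12))
    tags = torch.randint(0, 50, (2, 12))
    enc_states = emb(words, tags)              # stand-in for real encoder outputs
    dec_state = torch.zeros(2, 80)             # stand-in for a decoder hidden state
    selected = gate(enc_states, dec_state)     # (2, 12, 80), re-gated each step
```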

Updated: 2020-03-26