A Token-wise CNN-based Method for Sentence Compression
arXiv - CS - Computation and Language. Pub Date: 2020-09-23, arXiv:2009.11260
Weiwei Hou, Hanna Suominen, Piotr Koniusz, Sabrina Caldwell and Tom Gedeon

Sentence compression is a Natural Language Processing (NLP) task that aims to shorten sentences while preserving their key information. Its applications can benefit many fields, e.g., tools for language education. However, current methods are largely based on Recurrent Neural Network (RNN) models, which suffer from slow processing speed. To address this issue, in this paper we propose a token-wise Convolutional Neural Network (CNN) model that operates on pre-trained Bidirectional Encoder Representations from Transformers (BERT) features for deletion-based sentence compression. We also compare our model with RNN-based models and fine-tuned BERT. Although one of the RNN-based models marginally outperforms the other models given the same input, our CNN-based model is ten times faster than the RNN-based approach.
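To make the setup concrete, below is a minimal PyTorch sketch of the kind of model the abstract describes: a token-wise 1-D CNN applied over pre-trained BERT features, producing a per-token keep/delete decision. This is an illustration, not the authors' implementation; the class name, layer sizes, kernel width, and the choice to freeze BERT as a feature extractor are all assumptions.

# Minimal sketch (not the paper's code) of deletion-based sentence compression:
# a token-wise CNN over frozen pre-trained BERT features, classifying each
# token as keep (1) or delete (0). Hyperparameters are assumptions.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class TokenwiseCNN(nn.Module):
    def __init__(self, hidden=768, channels=256, kernel=3, num_labels=2):
        super().__init__()
        # 1-D convolution over the token axis; padding preserves sequence length,
        # so each token keeps its own position for classification.
        self.conv = nn.Conv1d(hidden, channels, kernel, padding=kernel // 2)
        self.classifier = nn.Linear(channels, num_labels)

    def forward(self, bert_features):               # (batch, seq_len, hidden)
        x = bert_features.transpose(1, 2)           # (batch, hidden, seq_len)
        x = torch.relu(self.conv(x))                # (batch, channels, seq_len)
        return self.classifier(x.transpose(1, 2))   # (batch, seq_len, num_labels)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased").eval()  # frozen feature extractor

sentence = "Sentence compression shortens sentences while preserving key information."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    features = bert(**inputs).last_hidden_state     # pre-trained BERT features

model = TokenwiseCNN()
logits = model(features)                            # per-token keep/delete logits
decisions = logits.argmax(-1)                       # untrained here; needs supervised labels

Because the convolution and the per-token classifier process all positions in parallel, inference avoids the step-by-step dependency of an RNN, which is the source of the speed advantage the abstract reports.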

Updated: 2020-09-24