LSBert: A Simple Framework for Lexical Simplification
arXiv - CS - Computation and Language. Pub Date: 2020-06-25, DOI: arxiv-2006.14939
Jipeng Qiang, Yun Li, Yi Zhu, Yunhao Yuan, Xindong Wu

Lexical simplification (LS) aims to simplify a sentence by replacing its complex words with simpler alternatives of equivalent meaning. Recent unsupervised lexical simplification approaches rely only on the complex word itself, ignoring the given sentence, to generate candidate substitutions, which inevitably produces a large number of spurious candidates. In this paper, we propose LSBert, a lexical simplification framework based on the pretrained representation model Bert, that is capable of (1) making use of the wider context both when detecting the words in need of simplification and when generating substitute candidates, and (2) taking five high-quality features into account for ranking candidates: Bert prediction order, a Bert-based language model, and the paraphrase database PPDB, in addition to the word frequency and word similarity commonly used in other LS methods. We show that our system outputs lexical simplifications that are grammatically correct and semantically appropriate, and achieves clear improvements over the baselines, outperforming the state-of-the-art by 29.8 Accuracy points on three well-known benchmarks.
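As an illustration of how several ranking features might be combined, the sketch below ranks candidates under each feature separately and then orders them by average rank. This is a minimal sketch under stated assumptions: the abstract does not say how the five features are combined, so average-rank aggregation is assumed here, and the function names, the three example features, the candidate words, and all scores are hypothetical toy values rather than results from the paper.

```python
from statistics import mean


def rank_positions(scores, higher_is_better=True):
    """Map each candidate to its 1-based rank under a single feature."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {cand: pos for pos, cand in enumerate(ordered, start=1)}


def combine_by_average_rank(feature_scores):
    """Order candidates by their mean rank across all features.

    feature_scores: {feature_name: {candidate: score}}, where a higher
    score is assumed to be better for every feature (an assumption of
    this sketch, not something stated in the abstract).
    """
    per_feature = [rank_positions(s) for s in feature_scores.values()]
    candidates = list(per_feature[0])
    avg_rank = {c: mean(r[c] for r in per_feature) for c in candidates}
    # Lowest average rank first; ties keep their first-feature order.
    return sorted(candidates, key=avg_rank.get)


# Hypothetical toy scores for three substitution candidates.
toy_scores = {
    "bert_prediction": {"sat": 0.40, "rested": 0.35, "alighted": 0.25},
    "word_frequency": {"sat": 0.90, "rested": 0.60, "alighted": 0.10},
    "word_similarity": {"sat": 0.50, "rested": 0.70, "alighted": 0.80},
}

ranking = combine_by_average_rank(toy_scores)
print(ranking)
```

With these toy values, the frequent and contextually likely candidate comes first even though it is not the most similar word, which is the kind of trade-off a multi-feature ranking is meant to capture.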

Updated: 2020-06-29