Evaluating Computational Language Models with Scaling Properties of Natural Language
Computational Linguistics (IF 9.3), Pub Date: 2019-09-01, DOI: 10.1162/coli_a_00355
Shuntaro Takahashi, Kumiko Tanaka-Ishii

In this article, we evaluate computational models of natural language with respect to the universal statistical behaviors of natural language. Statistical mechanical analyses have revealed that natural language text is characterized by scaling properties, which quantify the global structure in the vocabulary population and the long memory of a text. We study whether five scaling properties (given by Zipf’s law, Heaps’s law, Ebeling’s method, Taylor’s law, and long-range correlation analysis) can serve for evaluation of computational models. Specifically, we test n-gram language models, a probabilistic context-free grammar (PCFG), language models based on Simon/Pitman-Yor processes, neural language models, and generative adversarial networks (GANs) for text generation. Our analysis reveals that language models based on recurrent neural networks (RNNs) with a gating mechanism (i.e., long short-term memory, LSTM; a gated recurrent unit, GRU; and quasi-recurrent neural networks, QRNNs) are the only computational models that can reproduce the long memory behavior of natural language. Furthermore, through comparison with recently proposed model-based evaluation methods, we find that the exponent of Taylor’s law is a good indicator of model quality.
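
To make the Taylor's law exponent mentioned above more concrete, the following is a minimal Python sketch of one common way to estimate it for a token sequence. It is an illustration under stated assumptions, not the authors' implementation: the text is assumed to be split into non-overlapping fixed-size windows, each word's count per window is summarized by its mean and standard deviation, and the exponent is taken as the least-squares slope of log standard deviation against log mean. The function names `taylor_exponent` and `iid_zipf_sample` and the `window` parameter are hypothetical.

```python
import math
import random
from collections import Counter


def taylor_exponent(tokens, window=100):
    """Estimate alpha in sigma ~ mu**alpha from per-word counts in fixed windows.

    Assumption: non-overlapping windows and an ordinary least-squares fit in
    log-log space; other estimation procedures are possible.
    """
    n_windows = len(tokens) // window
    if n_windows < 2:
        raise ValueError("need at least two full windows")
    # Count every word inside each non-overlapping window.
    counts = [Counter(tokens[i * window:(i + 1) * window]) for i in range(n_windows)]
    vocab = set(tokens[:n_windows * window])
    xs, ys = [], []
    for word in vocab:
        per_window = [c[word] for c in counts]  # Counter returns 0 if absent
        mu = sum(per_window) / n_windows
        var = sum((x - mu) ** 2 for x in per_window) / n_windows
        if var > 0:  # zero-variance words cannot be placed on a log-log plot
            xs.append(math.log(mu))
            ys.append(0.5 * math.log(var))  # log of the standard deviation
    if len(xs) < 2:
        raise ValueError("too few words with non-zero variance")
    # Least-squares slope of log(sigma) against log(mu).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all words have the same mean count")
    return sxy / sxx


def iid_zipf_sample(n_tokens, vocab_size=1000, seed=0):
    """Hypothetical helper: draw word ids independently with Zipf-like weights."""
    rng = random.Random(seed)
    ids = list(range(vocab_size))
    weights = [1.0 / (rank + 1) for rank in ids]
    return rng.choices(ids, weights=weights, k=n_tokens)
```

For tokens drawn independently, as in `iid_zipf_sample`, the per-window counts are binomial and the estimate should lie close to 0.5. An estimate above 0.5 would indicate that word occurrences cluster more strongly than independent sampling produces, which is the kind of long memory behavior the abstract refers to.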

Updated: 2019-09-01