Why Do Parameter Values in the Zipf-Mandelbrot Distribution Sometimes Explode?
Journal of Quantitative Linguistics (IF 0.761), Pub Date: 2021-02-23, DOI: 10.1080/09296174.2021.1887613
Ján Mačutek

ABSTRACT

The Zipf-Mandelbrot distribution serves as a mathematical model for ranked frequencies in many areas of scientific research, including linguistics. Many linguistic units, such as words or word n-grams, follow this distribution. However, in some cases, such as graphemes in linguistics or species abundance and diversity data in biology, the parameters of the Zipf-Mandelbrot distribution are virtually uninterpretable, because their estimated values depend strongly on the precision of the numerical methods used to obtain them (values from several tens to several hundred are not uncommon). The paper shows that these values can be explained by convergence to the geometric distribution, which forces both parameters of the Zipf-Mandelbrot distribution to increase to infinity while their ratio converges to a constant. Examples illustrating this limit behaviour are presented.
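
The limit described in the abstract can be checked numerically. The sketch below assumes the common right-truncated parameterization P(r) ∝ (r + b)^(-a) for ranks r = 1, ..., n; the abstract does not state the paper's own notation, so the symbols a, b, c, q and all numeric values are illustrative assumptions. Writing (r + b)^(-a) = b^(-a) (1 + r/b)^(-a) suggests that, when a and b grow with a/b held at a constant c, the rank-dependent factor approaches e^(-cr), that is, a geometric decay with ratio q = e^(-c).

```python
import math

def zipf_mandelbrot_pmf(a, b, n):
    """Right-truncated Zipf-Mandelbrot probabilities for ranks 1..n,
    with P(r) proportional to (r + b) ** (-a) (assumed parameterization)."""
    # Work with log-weights so that large a and b do not underflow.
    log_w = [-a * math.log(r + b) for r in range(1, n + 1)]
    m = max(log_w)
    w = [math.exp(x - m) for x in log_w]
    total = sum(w)
    return [x / total for x in w]

def truncated_geometric_pmf(q, n):
    """Right-truncated geometric probabilities for ranks 1..n,
    with P(r) proportional to q ** r."""
    w = [q ** r for r in range(1, n + 1)]
    total = sum(w)
    return [x / total for x in w]

def max_abs_diff(p, s):
    """Largest absolute difference between two probability vectors."""
    return max(abs(x - y) for x, y in zip(p, s))

n = 30    # number of ranks (hypothetical inventory size)
c = 0.2   # fixed ratio a / b (hypothetical value)
geo = truncated_geometric_pmf(math.exp(-c), n)

gaps = []
for b in (10, 100, 1000, 10000):
    a = c * b  # both parameters grow, their ratio stays at c
    zm = zipf_mandelbrot_pmf(a, b, n)
    gaps.append(max_abs_diff(zm, geo))
    print(f"a={a:8.1f}  b={b:6d}  max |ZM - geometric| = {gaps[-1]:.2e}")
```

Under these assumptions the printed gap shrinks as b grows, so very different pairs (a, b) with nearly the same ratio describe almost the same distribution. That is consistent with the abstract's remark that the estimated values depend on the precision of the numerical method.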




Updated: 2021-02-23