Are all the word ranking methods the same?
International Journal of Modern Physics C ( IF 1.5 ) Pub Date : 2021-06-12 , DOI: 10.1142/s0129183121501448
Elham Najafi 1 , Alireza Valizadeh 1 , Amir H. Darooneh 2, 3

Text, as a complex system, is commonly studied with methods such as complex network or time series analysis in order to discover its properties. One of the most important properties of a text is its set of keywords, which are extracted by word ranking methods. Many such methods exist, and each ranks the words of a text differently according to frequency, spatial distribution or other word properties. Here, we explore how similar the various word ranking methods are. For this purpose, we studied the rank correlation of some important word ranking methods for a number of sample texts with different subjects and sizes. We found that the correlation between ranking methods grows with text size; that is, as the text size increases, the word ranks calculated by different ranking methods converge. We also found that the rank correlations of word ranking methods approach their maximum value for sufficiently large texts.
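
As a rough illustration of what comparing two word rankings by rank correlation can look like, the sketch below scores the words of a text in two ways and computes a Spearman coefficient between the resulting ranks. Everything in it is an assumption made for illustration: the abstract does not say which ranking methods or which correlation coefficient were used, so a plain frequency count and a simple gap-based clustering score (standard deviation of the gaps between occurrences divided by their mean) stand in for the "word ranking methods", and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def frequency_scores(words):
    """Score each word by its number of occurrences (illustrative method 1)."""
    return dict(Counter(words))


def clustering_scores(words):
    """Score each word by std(gaps) / mean(gaps) between successive occurrences
    (illustrative method 2, a stand-in for a spatial-distribution ranking).

    Words with fewer than three occurrences get a score of 0.0.
    """
    positions = {}
    for i, w in enumerate(words):
        positions.setdefault(w, []).append(i)
    scores = {}
    for w, pos in positions.items():
        gaps = [b - a for a, b in zip(pos, pos[1:])]
        scores[w] = pstdev(gaps) / mean(gaps) if len(gaps) >= 2 else 0.0
    return scores


def average_ranks(values):
    """Return 1-based ranks (largest value gets rank 1), averaging over ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(scores_a, scores_b):
    """Spearman rank correlation: Pearson correlation of the tie-averaged ranks.

    Both dictionaries are assumed to share the same vocabulary.
    Returns NaN when one ranking is constant (correlation undefined).
    """
    vocab = sorted(scores_a)
    ra = average_ranks([scores_a[w] for w in vocab])
    rb = average_ranks([scores_b[w] for w in vocab])
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    # Toy input; a real experiment would use long texts of varying size.
    words = "the cat sat on the mat the cat ran".split()
    rho = spearman(frequency_scores(words), clustering_scores(words))
    print(f"Spearman correlation between the two rankings: {rho:.3f}")
```

Repeating such a computation on progressively longer prefixes of a text would be one way to probe how the agreement between two rankings changes with text size, although the toy input here is far too short to show any trend.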

Updated: 2021-06-12