Words ranking and Hirsch index for identifying the core of the hapaxes in political texts
Journal of Informetrics (IF 3.4), Pub Date: 2020-05-31, DOI: 10.1016/j.joi.2020.101054
Valerio Ficcadenti, Roy Cerqueti, Marcel Ausloos, Gurjeet Dhesi

This paper presents a quantitative analysis of the content of official political speeches. We study a set of about one thousand speeches delivered by US Presidents, ranging from Washington to Trump. In particular, we investigate the relevance of rare words, i.e. those said only once in each speech, the so-called hapaxes. We implement a rank-size procedure of Zipf–Mandelbrot type to discuss the regularity of the hapaxes' frequencies over the whole set of speeches. Starting from the obtained rank-size law, we define and detect the core of the hapaxes set by means of a procedure based on a variant of the Hirsch index. We discuss the resulting list of words in the light of the overall US Presidents' speeches. We further show that this core of hapaxes can itself be well fitted by a Zipf–Mandelbrot law, and that it contains elements producing deviations at the low ranks between the scatter plot and the fitted curve, the so-called king and vice-roy effect. Some socio-political insights are derived from the findings about the US Presidents' messages.
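
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several points are assumptions: the whitespace tokenization, the choice of "number of speeches in which a word is a hapax" as the size variable, and the use of the classic Hirsch rule (largest rank r whose size is at least r) in place of the paper's unspecified variant. The function names and the parameters `a`, `b`, `c` are hypothetical.

```python
from collections import Counter


def hapaxes(text):
    """Words occurring exactly once in one text (naive whitespace tokens; an assumption)."""
    counts = Counter(text.lower().split())
    return {word for word, n in counts.items() if n == 1}


def hapax_rank_size(texts):
    """For each word, count the texts in which it is a hapax; sort by decreasing count."""
    totals = Counter()
    for text in texts:
        totals.update(hapaxes(text))
    # Ties are broken alphabetically so that the ranking is deterministic.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


def h_core(ranked):
    """Classic Hirsch-style cut (assumed here): keep ranks 1..h, h = max r with size(r) >= r."""
    h = 0
    for rank, (_, size) in enumerate(ranked, start=1):
        if size >= rank:
            h = rank
        else:
            break
    return ranked[:h]


def zipf_mandelbrot(rank, a, b, c):
    """Zipf-Mandelbrot rank-size law: size = a / (b + rank) ** c."""
    return a / (b + rank) ** c


# Toy corpus, purely illustrative (not data from the paper).
speeches = [
    "liberty and union",
    "liberty and justice and peace",
    "union liberty peace",
]
ranked = hapax_rank_size(speeches)
core = h_core(ranked)
print(core)  # [('liberty', 3), ('peace', 2)]
```

Fitting `zipf_mandelbrot` to the ranked sizes would require a nonlinear least-squares routine, which is left out here to keep the sketch dependency-free.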




Updated: 2020-05-31