Shannon entropy as a robust estimator of Zipf's Law in animal vocal communication repertoires
Methods in Ecology and Evolution (IF 6.3), Pub Date: 2020-12-01, DOI: 10.1111/2041-210x.13536
Arik Kershenbaum 1,2, Vlad Demartsev 3, David E. Gammon 4, Eli Geffen 5, Morgan L. Gustison 6, Amiyaal Ilany 7, Adriano R. Lameira 8

  1. Information complexity in animals is an indicator of advanced communication and an intricate socio‐ecology. Zipf's Law of least effort has been used to assess the potential information content of animal repertoires, including whether or not a particular animal communication could be ‘language‐like’. As all human languages follow Zipf's law, with a power law coefficient (PLC) close to −1, animal signals with similar probability distributions are postulated to possess similar information characteristics to language. However, estimation of the PLC from limited empirical datasets (e.g. most animal communication studies) is problematic because of biases from small sample sizes.
  2. The traditional approach to estimating Zipf's law PLC is to find the slope of a log–log rank‐frequency plot. Our alternative option uses the underlying equivalence between Shannon entropy (i.e. whether successive elements of a sequence are unpredictable, or repetitive) and PLC. Here, we test whether an entropy approach yields more robust estimates of Zipf's law PLC than the traditional approach.
  3. We examined the efficacy of the entropy approach in two ways. First, we estimated the PLC from synthetic datasets generated with a priori known power law probability distributions. This revealed that the estimated PLC using the traditional method is particularly inaccurate for highly stereotyped sequences, even at modest repertoire sizes. Estimation via Shannon entropy is accurate with modest sample sizes even for repertoires with thousands of distinct elements. Second, we applied these approaches to empirical data taken from 11 animal species. Shannon entropy produced a more robust estimate of PLC with lower variance than the traditional method, even when the true PLC is unknown. Our approach for the first time reveals Zipf's law operating in the vocal systems of multiple lineages: songbirds, hyraxes and cetaceans.
  4. As different methods of estimating the PLC can lead to misleading results in real data, estimating the balance of a communication system between simplicity and complexity is best performed using the entropy approach. This provides a more robust way to investigate the evolutionary constraints and processes that have acted on animal communication systems, and the parallels between these processes and the evolution of language.
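
The two estimators contrasted in points 2 and 3 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`zipf_probs`, `plc_from_slope`, `plc_from_entropy`) are hypothetical, the entropy estimator is assumed to work by finding the PLC whose ideal power law over the observed number of element types has the same Shannon entropy as the data, and no small-sample bias correction is applied.

```python
import math
import random
from collections import Counter


def zipf_probs(n_types, plc):
    """Probabilities p_r proportional to r**plc for ranks r = 1..n_types."""
    weights = [r ** plc for r in range(1, n_types + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def shannon_entropy(weights):
    """Shannon entropy in bits of a list of non-negative weights (counts or probabilities)."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def plc_from_slope(counts):
    """Traditional estimate: least-squares slope of log(frequency) against log(rank).

    Needs at least two element types with non-zero counts.
    """
    freqs = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def plc_from_entropy(counts, lo=-10.0, hi=0.0, iters=60):
    """Entropy-matching estimate (assumed procedure, see lead-in).

    Bisects over PLC in [lo, hi]; the entropy of an ideal power law rises
    monotonically as the PLC moves from steep (very negative) towards 0.
    """
    observed = [c for c in counts if c > 0]
    target = shannon_entropy(observed)
    n_types = len(observed)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if shannon_entropy(zipf_probs(n_types, mid)) > target:
            hi = mid  # model is too uniform, so a steeper (more negative) PLC is needed
        else:
            lo = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Synthetic dataset drawn from a known power law, in the spirit of point 3.
    rng = random.Random(0)
    true_plc, n_types, n_samples = -1.0, 50, 500
    sample = rng.choices(range(n_types), weights=zipf_probs(n_types, true_plc), k=n_samples)
    counts = list(Counter(sample).values())
    print("true PLC:        ", true_plc)
    print("slope estimate:  ", round(plc_from_slope(counts), 3))
    print("entropy estimate:", round(plc_from_entropy(counts), 3))
```

Both functions recover the generating PLC when given the exact probabilities; on finite samples they can diverge, which is the comparison the abstract describes. The sketch uses the number of observed element types as the repertoire size, which is itself an assumption when rare elements go unsampled.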


Updated: 2020-12-01