Grouped circular data in biology: advice for effectively implementing statistical procedures
Behavioral Ecology and Sociobiology (IF 2.3), Pub Date: 2020-07-20, DOI: 10.1007/s00265-020-02881-6
Lukas Landler, Graeme D. Ruxton, E. Pascal Malkemper

The most common statistical procedure applied to a sample of circular data is a test of the null hypothesis that points are spread uniformly around the circle, with no preferred direction. An array of tests has been developed for this purpose. However, these tests were designed for continuously distributed data, whereas collected data are often aggregated into a set of discrete values (e.g. rounded to the nearest degree), for instance because of the limited precision of measurement techniques. This disparity can cause an uncontrolled increase in the type I error rate, an effect that is particularly problematic for tests based on the distribution of arc lengths between adjacent points (such as the Rao spacing test). Here, we demonstrate that an easy-to-apply modification can correct this problem, and we recommend this modification when applying any test of circular uniformity, other than the Rayleigh test, to aggregated data. We provide R functions that implement this modification for several commonly used tests. In addition, we tested the power of a recently proposed test, the Gini test, and concluded that it does not offer a sufficient increase in power to replace any of the tests already in common use. In conclusion, using any of the standard circular tests (except the Rayleigh test) without modification on rounded or aggregated data, especially with larger sample sizes, will increase the proportion of false-positive results, but we demonstrate that a simple and general modification avoids this problem.

Circular data are widespread across biological disciplines, e.g. in orientation studies or circadian rhythms. Often these data are rounded to the nearest 1–10 degrees. We have shown previously that this rounding inflates the rate of false-positive results when the Rao test is used to test whether the data differ significantly from a random distribution. Here we present a modification that avoids this increase in false positives for rounded data while retaining statistical power for a variety of tests. In sum, we provide comprehensive guidance on how best to test for departure from uniformity in non-continuous data.
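
To make the problem concrete, the Python sketch below computes Rao's spacing statistic, U = ½ Σ |T_i − 360/n|, where the T_i are the arc lengths in degrees between adjacent points, and shows how rounding creates tied values, that is, arcs of length zero. The abstract does not spell out the proposed modification, so the `jitter_within_bin` helper is only an assumption about what such a correction could look like: it spreads each rounded value uniformly over its rounding interval. All function names are hypothetical, and this is not the authors' R code.

```python
import random


def rao_spacing_statistic(angles_deg):
    """Rao's spacing statistic U = 0.5 * sum(|T_i - 360/n|), in degrees."""
    n = len(angles_deg)
    a = sorted(x % 360.0 for x in angles_deg)
    # Arc lengths between adjacent points, including the wrap-around arc.
    arcs = [a[i + 1] - a[i] for i in range(n - 1)] + [360.0 - a[-1] + a[0]]
    return 0.5 * sum(abs(t - 360.0 / n) for t in arcs)


def round_to_bin(angles_deg, width):
    """Round each angle to the nearest multiple of `width` degrees."""
    return [(round(x / width) * width) % 360.0 for x in angles_deg]


def jitter_within_bin(rounded_deg, width, rng):
    """ASSUMPTION (not taken from the abstract): undo the grouping by adding
    uniform noise spanning the rounding interval to every rounded value."""
    half = width / 2.0
    return [(x + rng.uniform(-half, half)) % 360.0 for x in rounded_deg]


rng = random.Random(42)
sample = [rng.uniform(0.0, 360.0) for _ in range(100)]  # uniform: null is true
rounded = round_to_bin(sample, 10.0)                    # at most 36 distinct values
jittered = jitter_within_bin(rounded, 10.0, rng)

print("U, continuous:", rao_spacing_statistic(sample))
print("U, rounded:   ", rao_spacing_statistic(rounded))
print("U, jittered:  ", rao_spacing_statistic(jittered))
```

With 100 points rounded to 10-degree bins, at most 36 distinct values are possible, so at least 64 arcs have length zero. This pushes U upward even though the underlying data are uniform, which is the mechanism by which a spacing-based test can reject too often on grouped data.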

Updated: 2020-07-20