Permutation tests are robust and powerful at 0.5% and 5% significance levels
Behavior Research Methods (IF 5.953), Pub Date: 2021-05-28, DOI: 10.3758/s13428-021-01595-5
Kimihiro Noguchi, Frank Konietschke, Fernando Marmolejo-Ramos, Markus Pauly

The recent replication crisis has led to a number of ad hoc suggestions for reducing the chance of false-positive findings. Among them, Johnson (Proceedings of the National Academy of Sciences, 110, 19313–19317, 2013) and Benjamin et al. (Nature Human Behaviour, 2, 6–10, 2018) recommend a significance level of α = 0.005 (0.5%) instead of the conventional α = 0.05 (5%). Although this suggestion is easy to implement, it is unclear whether commonly used statistical tests are robust and/or powerful at such a small significance level. The main aim of our study is therefore to investigate the robustness and power-curve behavior of independent (unpaired) two-sample tests for metric and ordinal data at nominal significance levels of α = 0.005 and α = 0.05. An extensive simulation study shows that the permutation versions of the Welch t-test and the Brunner-Munzel test are particularly robust and powerful, whereas commonly used two-sample tests that rely on the t-distribution tend to be either liberal or conservative and show peculiar power-curve behavior under skewed distributions with variance heterogeneity.
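To make the idea of a permutation version of the Welch t-test concrete, here is a minimal sketch of a generic studentized permutation test in Python with NumPy. It is an illustration, not the authors' implementation: the function names `welch_t` and `permutation_welch_test`, the default of 10,000 random permutations, the two-sided alternative, and the add-one p-value convention are all assumptions made for this example.

```python
import numpy as np


def welch_t(x, y):
    """Welch t statistic: mean difference scaled by the unpooled standard error."""
    vx = x.var(ddof=1) / x.size
    vy = y.var(ddof=1) / y.size
    return (x.mean() - y.mean()) / np.sqrt(vx + vy)


def permutation_welch_test(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation test using the Welch t statistic (illustrative sketch).

    Returns the observed statistic and a Monte Carlo permutation p-value.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    t_obs = welch_t(x, y)
    pooled = np.concatenate([x, y])

    exceed = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled observations to two groups of the original sizes.
        perm = rng.permutation(pooled)
        t_perm = welch_t(perm[:x.size], perm[x.size:])
        if abs(t_perm) >= abs(t_obs):
            exceed += 1

    # Add-one convention (an assumption here): the p-value can never be exactly zero.
    p_value = (exceed + 1) / (n_perm + 1)
    return t_obs, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Hypothetical data: skewed samples with unequal sizes and unequal variances.
    a = rng.exponential(scale=1.0, size=15)
    b = rng.exponential(scale=3.0, size=30)
    t, p = permutation_welch_test(a, b)
    print(f"t = {t:.3f}, permutation p = {p:.4f}")
    print("reject at 0.05:", p < 0.05, "| reject at 0.005:", p < 0.005)
```

With this convention the smallest attainable p-value is 1/(n_perm + 1), so a test at α = 0.005 needs far more than 200 random permutations for the decision to be meaningful.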




Updated: 2021-05-28