P > .05: The incorrect interpretation of "not significant" results is a significant problem.
American Journal of Physical Anthropology (IF 2.6), Pub Date: 2020-06-22, DOI: 10.1002/ajpa.24092
Richard J Smith

Statistically nonsignificant (p > .05) results from a null hypothesis significance test (NHST) are often mistakenly interpreted as evidence that the null hypothesis is true, that is, that there is “no effect” or “no difference.” However, many of these results occur because the study had low statistical power to detect an effect. Power below 50% is common, in which case a result of no statistical significance is more likely to be incorrect than correct. The inference of “no effect” is not valid even if power is high. NHST assumes that the null hypothesis is true; p is the probability of the data under the assumption that there is no effect. A statistical test cannot confirm what it assumes. These incorrect statistical inferences could be eliminated if decisions based on p values were replaced by a biological evaluation of effect sizes and their confidence intervals. For a single study, the observed effect size is the best estimate of the population effect size, regardless of the p value. Unlike p values, confidence intervals provide information about the precision of the observed effect. In the biomedical and pharmacology literature, methods have been developed to evaluate whether effects are “equivalent,” rather than zero, as tested with NHST. These methods could be used by biological anthropologists to evaluate the presence or absence of meaningful biological effects. Most of what appears to be known about no difference or no effect between sexes, between populations, between treatments, and other circumstances in the biological anthropology literature is based on invalid statistical inference.
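The reasoning about power, confidence intervals, and equivalence can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the standardized effect size of 0.5, the group size of 20, the observed difference of 0.3, and the equivalence margin of 0.2 are all hypothetical, and a normal approximation stands in for an exact t-based calculation. The equivalence check follows the common convention of comparing a 90% confidence interval against the margin.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test.

    Normal approximation (an assumption of this sketch); effect_size is
    the true standardized mean difference between the two groups.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic lands in either rejection region.
    return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)


def confidence_interval(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval around an observed effect."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


def within_margin(interval, margin):
    """True only if the whole interval lies inside (-margin, +margin)."""
    low, high = interval
    return -margin < low and high < margin


# Hypothetical study: 20 individuals per group, true standardized effect 0.5.
print(f"power: {approx_power(0.5, 20):.2f}")  # about 0.35, well below 50%

# Hypothetical nonsignificant result: observed standardized difference 0.3.
se = sqrt(2 / 20)  # standard error of a standardized difference, n = 20 per group
print("95% CI:", confidence_interval(0.3, se))
print("equivalent within 0.2:", within_margin(confidence_interval(0.3, se, 0.90), 0.2))
```

Under these assumed numbers the test would miss a real effect of 0.5 roughly two times out of three, and the interval around the nonsignificant estimate is wide enough to include both zero and large effects, so it supports neither "no effect" nor equivalence.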

Updated: 2020-07-17