Informal versus formal judgment of statistical models: The case of normality assumptions
Psychonomic Bulletin & Review (IF 4.412), Pub Date: 2021-03-03, DOI: 10.3758/s13423-021-01879-z
Anthony J Bishara, Jiexiang Li, Christian Conley

Researchers sometimes use informal judgment for statistical model diagnostics and assumption checking. Informal judgment might seem more desirable than formal judgment because of a paradox: Formal hypothesis tests of assumptions appear to become less useful as sample size increases. We suggest that this paradox can be resolved by evaluating both formal and informal statistical judgment via a simplified signal detection framework. In four studies, we used this approach to compare informal judgments of normality diagnostic graphs (histograms, Q–Q plots, and P–P plots) to the performance of several formal tests (Shapiro–Wilk test, Kolmogorov–Smirnov test, etc.). Participants viewed graphs of sample data and judged whether the data came from a normal population (Experiments 1–2) or from a population close enough to normal for a parametric test to be more powerful than a nonparametric one (Experiments 3–4). Across all experiments, participants’ informal judgments showed lower discriminability than did formal hypothesis tests. This pattern occurred even after participants were given 400 training trials with feedback, a financial incentive, and ecologically valid distribution shapes. The discriminability advantage of formal normality tests led to slightly more powerful follow-up tests (parametric vs. nonparametric). Overall, the framework used here suggests that formal model diagnostics may be more desirable than informal ones.
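
As a rough illustration of the signal detection idea in the abstract, the sketch below scores any judge of normality (a human viewing a graph, or a formal test) by a discriminability index. The abstract does not say which index the authors used; the equal-variance d' and the log-linear correction here are assumptions, as are the function name `d_prime`, the coding of a "hit" as correctly flagging a non-normal sample, and the example counts, which are hypothetical and not taken from the paper.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Equal-variance d' (an assumed index, not necessarily the paper's).

    hit         = non-normal sample judged non-normal
    false alarm = normal sample judged non-normal
    A log-linear correction (+0.5 / +1) keeps the rates away from 0 and 1,
    where the z-transform would be infinite.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for illustration only (100 non-normal, 100 normal trials).
informal_judge = d_prime(hits=60, misses=40, false_alarms=40, correct_rejections=60)
formal_test = d_prime(hits=80, misses=20, false_alarms=10, correct_rejections=90)
print(f"informal d' = {informal_judge:.2f}, formal d' = {formal_test:.2f}")
```

Because d' separates sensitivity from response bias, a judge that simply rejects normality more often at larger sample sizes does not earn a higher score unless its hit rate rises faster than its false alarm rate.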




Updated: 2021-03-04