How (not) to measure replication
European Journal for Philosophy of Science ( IF 1.5 ) Pub Date : 2021-06-03 , DOI: 10.1007/s13194-021-00377-2
Samuel C. Fletcher

The replicability crisis refers to the apparent failures to replicate both important and typical positive experimental claims in psychological science and biomedicine, failures which have gained increasing attention in the past decade. In order to provide evidence that there is a replicability crisis in the first place, scientists have developed various measures of replication that help quantify or “count” whether one study replicates another. In this nontechnical essay, I critically examine five types of replication measures used in the landmark article “Estimating the reproducibility of psychological science” (Open Science Collaboration, Science, 349, aac4716, 2015) based on the following techniques: subjective assessment, null hypothesis significance testing, comparing effect sizes, comparing the original effect size with the replication confidence interval, and meta-analysis. The first four, I argue, remain unsatisfactory for a variety of conceptual or formal reasons, even taking into account various improvements. By contrast, at least one version of the meta-analytic measure does not suffer from these problems. It differs from the others in rejecting dichotomous conclusions, the assumption that one study replicates another or not simpliciter. I defend it from other recent criticisms, concluding however that it is not a panacea for all the multifarious problems that the crisis has highlighted.


Updated: 2021-06-03