Application of Bayesian posterior probabilistic inference in educational trials
International Journal of Research & Method in Education, Pub Date: 2020-12-17, DOI: 10.1080/1743727x.2020.1856067
Germaine Uwimpuhwe 1, 2 , Akansha Singh 1, 2 , Steve Higgins 2, 3 , Adetayo Kasim 1, 2, 3

ABSTRACT

Educational researchers advocate the use of an effect size and its confidence interval to assess the effectiveness of interventions instead of relying on a p-value, which has been blamed for the lack of reproducibility of research findings and for the misuse of statistics. The aim of this study is to provide a framework that gives direct evidence of whether an intervention works for the study participants in an educational trial, as a first step before generalizing the evidence to the wider population. A hierarchical Bayesian model was applied to ten cluster and multisite educational trials funded by the Education Endowment Foundation in England to estimate the effect size and its associated credible interval. The posterior probability is proposed as an alternative to the p-value: a simple and easily interpretable measure of whether an intervention worked. For education outcomes, the probability of at least one month's progress, or of exceeding any other appropriate threshold, is proposed in place of a threshold of zero for determining a positive impact. The results show that the probability of at least one month's progress ranges from 0.09 for one trial, GraphoGame Rime, to 0.94 for another, the Improving Numeracy and Literacy trial.
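To make the proposed metric concrete, the following is a minimal sketch of how a posterior probability of exceeding a threshold could be computed once posterior draws of an effect size are available. It is not the authors' implementation. The draws are simulated from a normal distribution purely for illustration, and the names `posterior_prob_exceeds`, `credible_interval`, and the value of `ONE_MONTH_THRESHOLD` are assumptions that do not come from the paper.

```python
import random


def posterior_prob_exceeds(draws, threshold):
    """Return the fraction of posterior draws strictly above the threshold."""
    if not draws:
        raise ValueError("draws must be non-empty")
    return sum(d > threshold for d in draws) / len(draws)


def credible_interval(draws, level=0.95):
    """Return an equal-tailed credible interval using nearest-rank quantiles."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(round(tail * (n - 1)))]
    upper = ordered[int(round((1.0 - tail) * (n - 1)))]
    return lower, upper


# Stand-in for MCMC output from a hierarchical model.
# These numbers are simulated and are NOT results from the paper.
rng = random.Random(0)
draws = [rng.gauss(0.10, 0.08) for _ in range(20000)]

# Hypothetical effect-size cutoff standing in for "one month's progress".
# The actual conversion is not given in the abstract; this is an assumption.
ONE_MONTH_THRESHOLD = 0.05

p_positive = posterior_prob_exceeds(draws, 0.0)
p_one_month = posterior_prob_exceeds(draws, ONE_MONTH_THRESHOLD)
lower, upper = credible_interval(draws)

print(f"P(effect size > 0):         {p_positive:.3f}")
print(f"P(effect size > threshold): {p_one_month:.3f}")
print(f"95% credible interval:      ({lower:.3f}, {upper:.3f})")
```

Raising the threshold above zero can only lower the reported probability, which is why a trial with a clearly positive effect size may still show a modest probability of at least one month's progress.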




Updated: 2020-12-17