Test Frequency, Stakes, and Feedback in Student Achievement: A Meta-Analysis.
Evaluation Review ( IF 2.121 ) Pub Date : 2019-06-01 , DOI: 10.1177/0193841x19865628
Richard P Phelps

Background: Test frequency, stakes associated with educational tests, and feedback from test results have been identified in the research literature as relevant factors in student achievement.

Objectives: Summarize the separate and joint contribution to student achievement of these three treatments and their interactions via multivariable meta-analytic techniques using a database of English-language studies spanning a century (1910–2010), comprising 149 studies and 509 effect size estimates.

Research design: Analysis employed robust variance estimation. Considered as potential moderators were hundreds of study features comprising various test designs and test administration, demographic, and source document characteristics.

Subjects: Subjects were students at all levels, from early childhood to adult, mostly from the United States but also from eight other countries.

Results: We find a summary effect size of 0.84 for the three treatments collectively. Further analysis suggests benefits accrue to the incremental addition of combinations of testing and feedback or stakes and feedback. Moderator analysis shows higher effect sizes associated with the following study characteristics: more recent year of publication, summative (rather than formative) testing, constructed (rather than selected) item response formats, alignment of subject matter between pre- and posttests, and recognition/recall (rather than core subjects, art, or physical education). Conversely, lower effect sizes are associated with postsecondary students (rather than early childhood–upper secondary), special education population, larger study population, random assignment (rather than another sampling method), use of shadow test as outcome measure, designation of individuals (rather than groups) as units of analysis, and academic (rather than corporate or government) research.

Updated: 2019-06-01