A Bayesian approach to improving measurement precision over multiple test occasions
Language Testing (IF 2.2) Pub Date: 2020-06-25, DOI: 10.1177/0265532220934203
Alistair Van Moere, Sean Hanlon

In language assessment and in educational measurement more broadly, there is a tendency to interpret scores from single-administration tests as accurate indicators of a latent trait (e.g., reading ability). Even in contexts where learners receive multiple formative assessments throughout the year, estimates of student ability are based only on the most recent assessment. This paper demonstrates a technique that combines prior test scores with current scores for learners who re-test periodically, in order to arrive at an estimate closer to each learner's true score. Over 21,000 learners from two separate studies (EFL and native speaker) were tested for reading proficiency between three and five times each, over a one- to two-year period, on a multiple-choice reading test that reported reading ability in Lexile® measures. Using Bayes' theorem, prior scores and the most recent test score were combined with uncertainty parameters (i.e., measurement error) to produce new estimates of student ability. This is advantageous because data from prior administrations are reused rather than discarded. The approach is recommended for periodic low-stakes tests designed to measure proficiency gains over time, and for high-stakes tests as an alternative to allowing candidates to cherry-pick their highest score for university applications.
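
The abstract does not give the update formula, so the following is only a minimal sketch of one common way to combine a prior score with a new score using Bayes' theorem: a normal-normal conjugate update in which each score is weighted by its precision (the inverse of its squared standard error of measurement). The names `Estimate`, `combine`, `combine_sequence`, and the optional `drift_sd` parameter (extra uncertainty allowed for ability change between test occasions) are hypothetical and are not taken from the paper; the authors' actual model may differ.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Estimate:
    """An ability estimate (e.g., in Lexile units) with its standard error."""
    mean: float
    sem: float  # standard error of measurement, same units as mean


def combine(prior: Estimate, current: Estimate, drift_sd: float = 0.0) -> Estimate:
    """Normal-normal Bayesian update (illustrative assumption, not the paper's model).

    The prior variance is inflated by drift_sd**2 to allow for growth or
    decline in true ability since the earlier test occasion.
    """
    prior_var = prior.sem ** 2 + drift_sd ** 2
    current_var = current.sem ** 2
    # Posterior precision is the sum of the two precisions.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / current_var)
    # Posterior mean is the precision-weighted average of the two scores.
    post_mean = post_var * (prior.mean / prior_var + current.mean / current_var)
    return Estimate(post_mean, sqrt(post_var))


def combine_sequence(scores: list[Estimate], drift_sd: float = 0.0) -> Estimate:
    """Fold a chronological list of test results into one running estimate."""
    estimate = scores[0]
    for score in scores[1:]:
        estimate = combine(estimate, score, drift_sd)
    return estimate


if __name__ == "__main__":
    # Made-up scores for illustration only.
    history = [Estimate(800, 60), Estimate(860, 60), Estimate(845, 60)]
    print(combine_sequence(history))
```

Under these assumptions the combined standard error is always smaller than that of either input score, which is the sense in which reusing earlier administrations improves precision. Setting `drift_sd` above zero down-weights older scores, a choice that matters when learners are expected to improve between occasions.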

Updated: 2020-06-25