Evaluating subscore uses across multiple levels: A case of reading and listening subscores for young EFL learners
Language Testing ( IF 2.400 ) Pub Date : 2019-10-14 , DOI: 10.1177/0265532219879654
Ikkyu Choi 1 , Spiros Papageorgiou 1

Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; to be worth reporting, a subscore should provide reliable and distinct information. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), its reliability and distinctiveness need to be justified at every relevant level. In this study, we examined whether reporting seven Reading and Listening subscores of the TOEFL Primary® test, a standardized English proficiency test for young English as a foreign language (EFL) learners, could be justified at the individual and school levels. We analyzed data collected in pilot administrations, in which 4776 students from 51 schools participated. We employed the classical test theory (CTT)-based approaches of Haberman (2008) and Haberman, Sinharay, and Puhan (2009) for the individual- and school-level investigations, respectively. We also supplemented the CTT-based approaches with a factor analytic approach for the individual-level analysis and a multilevel modeling approach for the school-level analysis. The results differed across the two levels: we found little support for reporting the subscores at the individual level, but strong evidence supporting the added value of the school-level subscores when the sample size for each school exceeds 50.
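
To make the individual-level criterion concrete, the following is a minimal sketch of the comparison commonly associated with Haberman (2008): a subscore is taken to have added value when the observed subscore predicts the true subscore better than the observed total score does, measured by the proportional reduction in mean squared error (PRMSE). The sketch assumes that the total score is the sum of the subscores and that measurement errors are uncorrelated across subscores. The function names and the numbers in the usage lines are hypothetical illustrations, not values or code from the study, and the school-level extension of Haberman, Sinharay, and Puhan (2009) is not covered.

```python
def prmse_from_subscore(reliability):
    """PRMSE when the true subscore is predicted from the observed subscore.

    Under CTT this equals the subscore reliability.
    """
    return reliability


def prmse_from_total(reliability, var_sub, var_total, cov_sub_total):
    """PRMSE when the true subscore is predicted from the observed total score.

    Assumes (illustrative assumptions, see lead-in) that the total is the sum
    of the subscores and that errors are uncorrelated across subscores, so
    Cov(true subscore, total) = Cov(observed subscore, total) - error variance.
    """
    error_var = (1.0 - reliability) * var_sub      # error variance of the subscore
    true_var = reliability * var_sub               # true-score variance of the subscore
    cov_true_total = cov_sub_total - error_var     # Cov(true subscore, observed total)
    # Squared correlation between the true subscore and the observed total.
    return cov_true_total ** 2 / (true_var * var_total)


def has_added_value(reliability, var_sub, var_total, cov_sub_total):
    """True if the subscore predicts its own true score better than the total does."""
    return prmse_from_subscore(reliability) > prmse_from_total(
        reliability, var_sub, var_total, cov_sub_total
    )


# Hypothetical inputs: a moderately reliable subscore that overlaps heavily
# with the total score, and a highly reliable subscore that overlaps less.
overlapping = has_added_value(0.6, 4.0, 36.0, 10.0)
distinct = has_added_value(0.9, 4.0, 36.0, 6.0)
```

With these hypothetical inputs, the first subscore fails the criterion because the total score is the better predictor, while the second passes; this mirrors the two requirements named in the abstract, reliability and distinctiveness.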

Updated: 2019-10-14