The Benefits of Fixed Item Parameter Calibration for Parameter Accuracy in Small Sample Situations in Large‐Scale Assessments
Educational Measurement: Issues and Practice (IF 1.402), Pub Date: 2020-08-25, DOI: 10.1111/emip.12381
Christoph König 1, Lale Khorramdel 2, Kentaro Yamamoto 2, Andreas Frey 1

Large‐scale assessments such as the Programme for International Student Assessment (PISA) have field trials where new survey features are tested for utility in the main survey. Because of resource constraints, there is a trade‐off between how much of the sample can be used to test new survey features and how much can be used for the initial item response theory (IRT) scaling. Utilizing real assessment data of the PISA 2015 Science assessment, this article demonstrates that using fixed item parameter calibration (FIPC) in the field trial yields stable item parameter estimates in the initial IRT scaling for samples as small as n = 250 per country. Moreover, the results indicate that for the recovery of the country‐specific latent trait distributions, the estimates of the trend items (i.e., the information introduced into the calibration) are crucial. Thus, concerning the country‐level sample size of n = 1,950 currently used in the PISA field trial, FIPC is useful for increasing the number of survey features that can be examined during the field trial without the need to increase the total sample size. This enables international large‐scale assessments such as PISA to keep up with state‐of‐the‐art developments regarding assessment frameworks, psychometric models, and delivery platform capabilities.
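
The general idea behind FIPC can be illustrated with a minimal sketch. It is not the authors' procedure and not the operational PISA scaling: it assumes a simple Rasch model, complete simulated responses, and an EM algorithm on a fixed quadrature grid. All names (`fipc_rasch`, `trend_b`, `true_new_b`) and all numeric values are hypothetical. The difficulties of the trend items are held fixed, while the difficulties of the new items and the latent trait distribution are estimated from a sample of n = 250.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated setup: every value here is an illustrative assumption, not PISA data.
trend_b = np.linspace(-1.5, 1.5, 10)                  # trend-item difficulties, held fixed
true_new_b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])    # new-item difficulties, to be estimated
n_persons = 250                                       # small per-country sample
true_theta = rng.normal(0.5, 1.0, n_persons)          # latent trait, shifted away from N(0, 1)

all_b = np.concatenate([trend_b, true_new_b])
prob = 1.0 / (1.0 + np.exp(-(true_theta[:, None] - all_b[None, :])))
responses = (rng.random(prob.shape) < prob).astype(float)


def fipc_rasch(responses, fixed_b, n_quad=41, n_iter=200):
    """EM for a Rasch model in which the first len(fixed_b) items stay fixed."""
    n_fixed = len(fixed_b)
    n_new = responses.shape[1] - n_fixed
    nodes = np.linspace(-4.0, 4.0, n_quad)            # quadrature grid for the latent trait
    weights = np.exp(-0.5 * nodes**2)                 # start from a standard normal shape
    weights /= weights.sum()
    b = np.concatenate([np.asarray(fixed_b, dtype=float), np.zeros(n_new)])

    for _ in range(n_iter):
        # E-step: posterior weight of each quadrature node for each person.
        p = 1.0 / (1.0 + np.exp(-(nodes[:, None] - b[None, :])))   # (nodes, items)
        loglik = responses @ np.log(p).T + (1.0 - responses) @ np.log(1.0 - p).T
        post = np.exp(loglik) * weights
        post /= post.sum(axis=1, keepdims=True)

        # M-step, part 1: update the latent distribution (empirical histogram).
        weights = post.mean(axis=0)

        # M-step, part 2: one Newton step for the new items only.
        # The trend-item entries of b are never touched.
        n_q = post.sum(axis=0)                        # expected persons per node
        r_q = post.T @ responses[:, n_fixed:]         # expected correct responses per node
        p_new = p[:, n_fixed:]
        grad = (n_q[:, None] * p_new - r_q).sum(axis=0)
        info = (n_q[:, None] * p_new * (1.0 - p_new)).sum(axis=0)
        b[n_fixed:] += grad / info

    return b[n_fixed:], nodes, weights


est_new_b, nodes, weights = fipc_rasch(responses, trend_b)
est_mean = float(np.sum(nodes * weights))
print("estimated new-item difficulties:", np.round(est_new_b, 2))
print("estimated latent mean:", round(est_mean, 2))
```

In this sketch the fixed trend items are the only thing that anchors the scale, so the estimated latent mean and the new-item difficulties both depend on how accurate those fixed values are. This mirrors the abstract's point that the trend-item estimates carry the information introduced into the calibration.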

Updated: 2020-08-25