Performance decline in low-stakes educational assessments: different mixture modeling approaches
Large-scale Assessments in Education, Pub Date: 2017-11-01, DOI: 10.1186/s40536-017-0049-3
Marit K. List, Alexander Robitzsch, Oliver Lüdtke, Olaf Köller, Gabriel Nagy

Background: In low-stakes educational assessments, test takers might show a performance decline (PD) on end-of-test items. PD is a concern in educational assessments, especially when groups of students are to be compared on the proficiency variable, because item responses gathered in the groups could be differently affected by PD. In order to account for PD, mixture item response theory (IRT) models have been proposed in the literature.

Methods: In this article, multigroup extensions of three existing mixture models that assess PD are compared. The models were applied to the mathematics test in a large-scale study targeting school track differences in proficiency.

Results: Despite the differences in the specification of PD, all three models showed rather similar item parameter estimates that were, however, different from the estimates given by a standard two-parameter IRT model. In addition, all models indicated that the amount of PD differed between tracks, and school track differences in proficiency were slightly reduced when PD was accounted for. Nevertheless, the models gave different estimates of the proportion of students showing PD and differed somewhat from each other in the adjustment of proficiency scores for PD.

Conclusions: Multigroup mixture models can be used to study how PD interacts with proficiency and other variables to provide a better understanding of the mechanisms behind PD. Differences between the presented models with regard to their assumptions about the relationship between PD and item responses are discussed.
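For orientation, the following LaTeX sketch shows the standard two-parameter logistic (2PL) IRT model referenced above, together with one generic way a mixture extension can represent PD. The latent class structure and the constant success probability g used here are illustrative assumptions only; the three mixture models compared in the article specify PD in their own, differing ways.

% Standard 2PL model: probability of a correct response of test taker i to item j,
% with proficiency \theta_i, item discrimination a_j, and item difficulty b_j.
\[
P(X_{ij} = 1 \mid \theta_i) = \frac{\exp\{ a_j (\theta_i - b_j) \}}{1 + \exp\{ a_j (\theta_i - b_j) \}}
\]

% Illustrative mixture extension (not the article's exact specification):
% a latent class c_i indicates whether test taker i shows PD (c_i = 1) or not (c_i = 0).
% Non-PD respondents follow the 2PL on all items; PD respondents follow the 2PL on
% early items but answer end-of-test items with a low constant probability g,
% mimicking a random-responding process.
\[
P(X_{ij} = 1 \mid \theta_i, c_i) =
\begin{cases}
\dfrac{\exp\{ a_j (\theta_i - b_j) \}}{1 + \exp\{ a_j (\theta_i - b_j) \}} & \text{if } c_i = 0 \text{ or item } j \text{ is administered early,} \\[2ex]
g & \text{if } c_i = 1 \text{ and item } j \text{ is at the end of the test.}
\end{cases}
\]

In a multigroup version of such a model, the class proportions and the proficiency distribution are allowed to differ across groups (here, school tracks), which is what permits group comparisons of proficiency to be adjusted for PD.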

Updated: 2017-11-01