Performance decline in low-stakes educational assessments: different mixture modeling approaches
Visualization in Engineering, Pub Date: 2017-11-01, DOI: 10.1186/s40536-017-0049-3
Marit K. List, Alexander Robitzsch, Oliver Lüdtke, Olaf Köller, Gabriel Nagy

In low-stakes educational assessments, test takers might show a performance decline (PD) on end-of-test items. PD is a concern in educational assessments, especially when groups of students are to be compared on the proficiency variable, because item responses gathered in the groups could be differently affected by PD. To account for PD, mixture item response theory (IRT) models have been proposed in the literature. In this article, multigroup extensions of three existing mixture models that assess PD are compared. The models were applied to the mathematics test in a large-scale study targeting school track differences in proficiency. Despite the differences in the specification of PD, all three models showed rather similar item parameter estimates that were, however, different from the estimates given by a standard two-parameter IRT model. In addition, all models indicated that the amount of PD differed between tracks, in that school track differences in proficiency were slightly reduced when PD was accounted for. Nevertheless, the models gave different estimates of the proportion of students showing PD, and differed somewhat from each other in the adjustment of proficiency scores for PD. Multigroup mixture models can be used to study how PD interacts with proficiency and other variables to provide a better understanding of the mechanisms behind PD. Differences between the presented models with regard to their assumptions about the relationship between PD and item responses are discussed.
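
To make the idea of a mixture IRT model for PD concrete, the following is a minimal sketch, not any of the three models compared in the article. It assumes a standard two-parameter logistic (2PL) item response function and a toy two-class mixture in which a "decline" class faces harder items from some onset position onward. The function names and the parameters `pi_decline`, `onset`, and `penalty` are hypothetical and chosen only for illustration.

```python
import math


def p_correct_2pl(theta, a, b):
    """2PL item response function: probability of a correct response
    for proficiency theta, item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_correct_mixture(theta, a, b, position, pi_decline, onset, penalty):
    """Marginal probability of a correct response under a toy two-class mixture.

    Assumption (illustrative only): with probability pi_decline a test taker
    belongs to a decline class whose items at positions >= onset behave as if
    their difficulty were increased by penalty. All other responses follow
    the plain 2PL model.
    """
    p_regular = p_correct_2pl(theta, a, b)
    if position >= onset:
        p_declined = p_correct_2pl(theta, a, b + penalty)
    else:
        p_declined = p_regular
    # Mix the two class-specific probabilities by class membership.
    return (1.0 - pi_decline) * p_regular + pi_decline * p_declined
```

In this sketch, ignoring the decline class would attribute the lower success rate on late items to the items themselves, which is one way to see why item parameter estimates can differ between a standard 2PL model and a mixture model.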

Updated: 2017-11-01