Interval estimation procedures for true scores of a test composed of polytomous items: An application of the multinomial error model.
Psychological Methods (IF 10.929) Pub Date: 2020-08-27, DOI: 10.1037/met0000338
Kyung Yong Kim, Seohee Park, Won-Chan Lee

When a person takes alternative forms of the same test across replications of the testing procedure, the test taker's observed scores on the alternative forms are rarely identical. In educational and psychological measurement, inconsistencies in a test taker's scores that are irrelevant to the construct being measured are attributed to errors of measurement. Typically, errors of measurement are summarized as the standard deviation of a test taker's observed scores over replication of the same testing procedure. Assuming that errors of measurement follow a multinomial distribution (i.e., multinomial error model), the main goal of this study was to propose two interval estimation procedures, which are referred to as the score-like and Perks procedures, for true scores of a test with polytomous items. The performance of the score-like and Perks procedures was compared with that of two normal approximation procedures under the multinomial error model and a procedure based on item response theory (IRT) through simulation. In general, the score-like and Perks procedures outperformed the other three procedures when data were generated under the multinomial error theory framework and showed reasonable results when data were generated under the IRT framework. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
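
To make the multinomial error model concrete, here is a minimal Python sketch of a normal-approximation interval for a test taker's true total score. It is an illustration under stated assumptions, not the authors' procedure. The abstract does not specify the score-like, Perks, or normal-approximation formulas, so none of them are reproduced here. The sketch assumes every item shares the same score categories, treats the counts of items falling in each category as multinomial, and uses a plug-in variance with no small-sample correction. The function name `normal_approx_interval` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def normal_approx_interval(counts, scores, level=0.95):
    """Normal-approximation interval for a true total score (illustrative sketch).

    counts[j] is the number of items on which the test taker earned scores[j].
    Assumption: the counts are multinomial with n = sum(counts) trials, so the
    total score sum_j scores[j] * counts[j] has variance
    n * (sum_j scores[j]**2 * p_j - (sum_j scores[j] * p_j)**2).
    """
    if len(counts) != len(scores):
        raise ValueError("counts and scores must have the same length")
    n = sum(counts)
    if n <= 0:
        raise ValueError("at least one item is required")

    # Plug-in estimates of the category proportions.
    props = [x / n for x in counts]
    mean_item = sum(c * p for c, p in zip(scores, props))
    second_moment = sum(c * c * p for c, p in zip(scores, props))
    # Guard against tiny negative values caused by floating-point rounding.
    var_item = max(0.0, second_moment - mean_item ** 2)

    total = n * mean_item
    se = math.sqrt(n * var_item)  # estimated conditional standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return total - z * se, total + z * se


if __name__ == "__main__":
    # Hypothetical data: 10 items scored 0/1/2, with 2, 3, and 5 items per category.
    low, high = normal_approx_interval([2, 3, 5], [0, 1, 2])
    print(f"95% interval for the true total score: ({low:.2f}, {high:.2f})")
```

Under this sketch, the interval collapses to a single point when every item falls in the same category, because the estimated variance is then zero. Degenerate behavior of this kind at extreme scores is a well-known weakness of plain normal-approximation intervals.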

Updated: 2020-08-27