A comprehensive review of Rasch measurement in language assessment: Recommendations and guidelines for research
Language Testing (IF 2.2), Pub Date: 2020-07-08, DOI: 10.1177/0265532220927487
Vahid Aryadoust, Li Ying Ng, Hiroki Sayama

Over the past decades, the application of Rasch measurement in language assessment has gradually increased. In the present study, we coded, for multiple features, 215 papers that used Rasch measurement and were published in 21 applied linguistics journals. We found that seven Rasch models and 23 software packages were adopted in these papers, with many-facet Rasch measurement (n = 100) and Facets (n = 113) being the most frequently used Rasch model and software, respectively. Significant differences were detected in the number of papers that applied Rasch measurement to different language skills and components, with writing (n = 63) and grammar (n = 12) being the most and least frequently investigated, respectively. In addition, significantly fewer papers reported person separation (n = 73) than omitted it (n = 142), and the same held for item separation (n = 59 reported vs. n = 156 not reported). An alarming finding was how few papers reported a unidimensionality check (n = 57 vs. 158) or a local independence check (n = 19 vs. 196). Finally, a multilayer network analysis revealed that research involving Rasch measurement has created two major discrete communities of practice (clusters), which can be characterized by features such as language skills, the Rasch models used, and the reporting of item reliability/separation vs. person reliability/separation. Cluster 1 was accordingly labelled the production and performance cluster, whereas cluster 2 was labelled the perception and language elements cluster. Guidelines and recommendations for analyzing unidimensionality, local independence, data-to-model fit, and reliability in Rasch model analysis are proposed.

Updated: 2020-07-08