Development and validation of a rating scale for summarization as an integrated task
Asian-Pacific Journal of Second and Foreign Language Education, Pub Date: 2021-07-01, DOI: 10.1186/s40862-021-00113-6
Jiuliang Li, Qian Wang

Summary writing is essential for academic success and has attracted renewed interest in academic research and large-scale language testing. However, less attention has been paid to the development and evaluation of scoring scales for summary writing. This study reports on the validation of a summary rubric that represents an approach to scale development with limited resources, adopted for reasons of practicality. Participants were 83 students and three raters. Diagnostic evaluation of the scale components and categories was based on raters’ perceptions of their use and on the scores of students’ summaries, which were analyzed using multifaceted Rasch measurement (MFRM). Correlation analysis revealed significant relationships among the scoring components, but the coefficients among some of the components were excessively high. MFRM analysis provided evidence in support of the usefulness of the scoring rubric, but also suggested the need to refine the components and categories. According to the raters, the rubric was ambiguous in addressing some crucial text features. This study has implications for summarization task design and, in particular, for scoring scale development and validation.




Updated: 2021-07-01