Investigating a new method for standardising essay marking using levels-based mark schemes
International Journal of Assessment Tools in Education, Pub Date: 2019-05-15, DOI: 10.21449/ijate.564824
Jackie GREATOREX 1, Tom SUTCH 1, Magda WERNO 1, Jess BOWYER 2, Karen DUNN 3

Standardisation is a procedure used by Awarding Organisations to maximise marking reliability, by teaching examiners to consistently judge scripts using a mark scheme. However, research shows that people are better at comparing two objects than judging each object individually. Consequently, Oxford, Cambridge and RSA (OCR, a UK awarding organisation) proposed investigating a new procedure, involving ranking essays, where essay quality is judged in comparison to other essays. This study investigated the marking reliability yielded by traditional standardisation and ranking standardisation. The study entailed a marking experiment followed by examiners completing a questionnaire. In the control condition live procedures were emulated as authentically as possible within the confines of a study. The experimental condition involved ranking the quality of essays from the best to the worst and then assigning marks. After each standardisation procedure the examiners marked 50 essays from an AS History unit. All participants experienced both procedures, and marking reliability was measured. Additionally, the participants’ questionnaire responses were analysed to gain an insight into examiners’ experience. It is concluded that the Ranking Procedure is unsuitable for use in public examinations in its current form. The Traditional Procedure produced statistically significantly more reliable marking, whilst the Ranking Procedure involved a complex decision-making process. However, the Ranking Procedure produced slightly more reliable marking at the extremities of the mark range, where previous research has shown that marking tends to be less reliable.

Updated: 2019-05-15