The standard setting process: validating interpretations of stakeholders
Visualization in Engineering Pub Date : 2019-02-18 , DOI: 10.1186/s40536-019-0071-8
Nele Kampa , Helene Wagner , Olaf Köller

Stakeholders’ interpretations of the findings of large-scale educational assessments can influence important decisions. In educational assessment, standard setting remains an especially critical element because it is complex and largely unstandardized. Instruments established by means of standard-setting procedures, such as proficiency levels (PL), therefore appear to be arbitrary to some degree. Given the significance such results take on when they are communicated to stakeholders or the public, a thorough validation of this process seems crucial. In our study, ministry stakeholders intended to use PL established in an assessment of science abilities to obtain information about students’ strengths and weaknesses in science in general and, specifically, about the extent to which students were prepared for future science studies. The aim of our study was to investigate the validity arguments regarding these two intended interpretations. Based on a university science test administered to 3641 upper secondary students (Grade 13), a panel of nine experts set four cut scores using two variations of the Angoff method: the Yes/No Angoff method (for multiple-choice items) and the extended Angoff method (for complex multiple-choice items). We carried out t-tests, repeated-measures ANOVA, G-studies, and regression analyses to support the procedural, internal, external, and consequential validity elements of the aforementioned interpretations of the cut scores. Our t-tests and G-studies showed that the intended use of the cut scores was valid regarding procedural and internal aspects of validity. These findings were called into question, however, by the experts’ lack of confidence in the established cut scores. Regression analyses that included the number of lessons taught and the intended and pursued science-related studies showed good external and poor consequential validity.
The cut scores can be used as an indicator of 13th graders’ strengths and weaknesses in science. They should not be used as an indicator of preparedness for university science studies. Because assessment formats are continually evolving and leading to more complex designs, further research is needed on the application of new standard-setting methods to meet the challenges arising from this development.

Updated: 2019-02-18