Comparing different response time threshold setting methods to detect low effort on a large-scale assessment
Visualization in Engineering, Pub Date: 2021-04-27, DOI: 10.1186/s40536-021-00100-w
James Soland, Megan Kuhfeld, Joseph Rios

Low examinee effort is a major threat to valid uses of many test scores. Fortunately, several methods have been developed to detect noneffortful item responses, most of which use response times. To accurately identify noneffortful responses, one must set response time thresholds separating those responses from effortful ones. While other studies have compared the efficacy of different threshold-setting methods, they typically do so using simulated or small-scale data. When large-scale data are used in such studies, they often are not from a computer-adaptive test (CAT), use only a handful of items, or do not comprehensively examine different threshold-setting methods. In this study, we use reading test scores from 728,923 3rd–8th-grade students in 2056 schools across the United States taking a CAT consisting of nearly 12,000 items to compare threshold-setting methods. In so doing, we help provide guidance to developers and administrators of large-scale assessments on the tradeoffs involved in using a given method to identify noneffortful responses.

Updated: 2021-04-28