On the Cover: Is my Test Speeded?
Educational Measurement: Issues and Practice (IF 2.7). Pub Date: 2020-12-06. DOI: 10.1111/emip.12402
Zhongmin Cui

“Is my test speeded?” is a question that test developers should ask when developing a power test, that is, a test that measures ability rather than speed. A test is not speeded if most test takers have sufficient time to answer all of its questions within the time allotted. One way to check speededness is to compute the completion rate, that is, the proportion of test takers who answered the questions at the end of the test. When the time spent on each question is available, a second way is to compare the time spent on questions at the end of the test with the time spent on questions at the beginning.
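Both checks can be sketched in a few lines of Python. This is a minimal illustration, not the author's procedure. The data layout (one list per test taker, with `None` marking an unanswered question), the function names, and the choice of how many questions count as "the end" or "the beginning" are all assumptions made for the example.

```python
from statistics import mean


def completion_rate(responses, last_k=1):
    """Proportion of test takers who answered all of the last `last_k` questions.

    `responses` holds one list per test taker; None marks an unanswered
    question (an assumed encoding, not one taken from the article).
    """
    finished = [all(r is not None for r in row[-last_k:]) for row in responses]
    return sum(finished) / len(finished)


def end_to_start_time_ratio(times, k=2):
    """Mean time on the last k questions divided by mean time on the first k."""
    first = mean(t for row in times for t in row[:k])
    last = mean(t for row in times for t in row[-k:])
    return last / first


# Toy data: four test takers, four questions each.
responses = [
    ["A", "C", "B", "D"],
    ["B", "C", None, None],
    ["A", "A", "B", "C"],
    ["D", "C", "B", None],
]
# Toy response times in seconds for two test takers.
times = [
    [40, 60, 50, 50],
    [60, 40, 30, 20],
]

print(completion_rate(responses, last_k=1))
print(end_to_start_time_ratio(times, k=2))
```

A completion rate well below 1, or a time ratio well below 1, would be a signal worth investigating. Neither number is a formal test of speededness on its own.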

This issue's cover features Visualizing Test Speededness with Item Response Time, Position, and Difficulty, created by Meghan Sullivan of the University of Kansas and Bozhidar M. Bashkov of the American Board of Internal Medicine. In their own words:

The graphic shows the average response times for items according to their position in the test sequence and their difficulty (i.e., proportion of correct responses). The visual here answers the following research questions: 1) “Is there a general downward trend in mean item response times toward the end of the test, controlling for item difficulty?” and 2) “Do items become more difficult at the end of the test?” The plot shows data from multiple 60‐item modules on a high‐stakes certification exam. This graph was generated using the R package ggplot2 (Wickham, 2009), and code is available upon request. The solid black line is the grand mean response time by item position across all modules.

First, as indicated by the solid black line, it does not appear that item response times are generally decreasing toward the end of the test, indicating that examinees are not “speeding up” as they reach the time limit. Second, the colored lines, indicating the trends in average response time by position as a function of item difficulty, show that more difficult items take more time to complete than easier items, but this relationship does not depend on item position. That is, items do not become more difficult if placed at the end of the test. Overall, the graphic suggests this exam is not speeded.
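The first question, whether mean response time trends downward toward the end of the test, can be checked numerically as well as visually. The sketch below is an illustration only: the graphic itself was made in R with ggplot2, and that code is not reproduced here. The record layout (item position, mean response time in seconds) and both function names are assumptions, and the least-squares slope is one simple summary of the trend rather than the authors' method.

```python
from collections import defaultdict
from statistics import mean


def grand_mean_by_position(records):
    """Average response time at each item position, pooled across modules."""
    by_pos = defaultdict(list)
    for position, seconds in records:
        by_pos[position].append(seconds)
    return {pos: mean(vals) for pos, vals in sorted(by_pos.items())}


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Toy data: (item position, mean response time in seconds) from two modules.
records = [
    (1, 62), (1, 58),
    (2, 61), (2, 59),
    (3, 57), (3, 63),
    (4, 60), (4, 60),
]

grand = grand_mean_by_position(records)
slope = ols_slope(list(grand.keys()), list(grand.values()))
print(grand)
print(slope)
```

A slope near zero is consistent with a flat grand-mean line. A clearly negative slope would suggest that examinees speed up as the test goes on.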

Interested readers may contact Bozhidar M. Bashkov (bbashkov@abim.org) with questions about this graph or to request the code used to create it. I recommend that readers try varying the number of item difficulty categories when creating the graph. The comparison may look sharper with a small number of categories, for example, three.
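One way to experiment with the number of difficulty categories is to make it a parameter of the grouping step. The following sketch assumes equal-width bands on the proportion of correct responses. The article does not say how the categories in the graphic were defined, so the banding rule, the tuple layout (position, proportion correct, seconds), and the function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def difficulty_band(p_correct, n_bands):
    """Equal-width band index for a proportion correct in [0, 1].

    Band 0 holds the hardest items (lowest proportion correct).
    """
    return min(int(p_correct * n_bands), n_bands - 1)


def mean_time_by_band_and_position(items, n_bands=3):
    """Mean response time for each (difficulty band, item position) cell."""
    cells = defaultdict(list)
    for position, p_correct, seconds in items:
        band = difficulty_band(p_correct, n_bands)
        cells[(band, position)].append(seconds)
    return {key: mean(vals) for key, vals in sorted(cells.items())}


# Toy data: (item position, proportion correct, mean response time in seconds).
items = [
    (1, 0.20, 90), (1, 0.50, 70), (1, 0.90, 40),
    (2, 0.30, 88), (2, 0.60, 72), (2, 0.80, 42),
]

# Compare a two-band and a three-band grouping of the same items.
for n in (2, 3):
    print(n, mean_time_by_band_and_position(items, n_bands=n))
```

Each band would become one colored line in a plot of mean response time against position. Fewer bands pool more items into each line, which is why the contrast between bands may be easier to see.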




Updated: 2020-12-23