Monotonicity in practice of adaptive testing
arXiv - CS - Artificial Intelligence. Pub Date: 2020-09-15, DOI: arxiv-2009.06981
Martin Plajner and Jiří Vomlel

In our previous work we showed how Bayesian networks can be used for adaptive testing of student skills. Later, we took advantage of monotonicity restrictions to learn models that fit the data better. This article combines these two lines of work: it evaluates Bayesian network models used for computerized adaptive testing and learned with a recently proposed monotonicity gradient algorithm. This learning method is compared with another monotone method, the isotonic regression EM algorithm. The quality of the methods is evaluated empirically on a large data set of the Czech National Mathematics Exam. Besides the advantages of the adaptive testing approach, we also observed advantageous behavior of the monotone methods, especially for small learning data set sizes. Another novelty of this work is the use of the reliability interval of the score distribution, which is used to predict a student's final score and grade. The experiments show that the test can be shortened while keeping its reliability, and that monotonicity increases prediction quality when training data are limited. The monotone model learned by the gradient method has a lower question prediction quality than unrestricted models, but it is better at the main target of this application, which is student score prediction. An important observation is that merely optimizing the model likelihood or the prediction accuracy does not necessarily lead to the model that best describes the student.

Updated: 2020-09-16