Dropout prediction model in MOOC based on clickstream data and student sample weight


Soft Computing (IF 3.1), Pub Date: 2021-04-17, DOI: 10.1007/s00500-021-05795-1
Cong Jin

The high dropout rate of massive open online courses (MOOCs) has seriously hindered their popularity and promotion. How to predict student dropout in MOOCs effectively, so that intervention can happen as early as possible, has become a hot topic. Students in MOOCs differ greatly in learning behaviors, learning habits, and learning time. Because the performance of machine learning-based classifiers depends heavily on the quality of the training samples, different student samples affect the prediction performance of a machine learning-based dropout prediction model (DPM) differently. To address this problem, this paper proposes a new machine learning-based DPM. Since the traditional neighborhood concept does not take sample labels into account, a new neighborhood definition, the max neighborhood, is first given; it depends not only on the distance between samples but also on their labels. Then, based on the max neighborhood, an algorithm for calculating the initial weight of each student sample is studied, in contrast to the common practice of selecting initial values randomly. Next, the initial weights of the student samples are further optimized using an intelligent optimization method. Finally, classifiers trained on the weighted training samples are used as the DPM. Experimental results on public data sets, assessed by direct observation and statistical testing, indicate that training sample weighting and intelligent optimization can significantly improve the predictive performance of the DPM.
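
The abstract does not give the formal definition of the max neighborhood, so the following Python sketch rests on one assumed reading: the max neighborhood of a sample is the largest ball around it that contains no sample with a different label, and the initial weight is the share of same-label samples inside that ball. The function names, the toy features, and the weighted nearest-centroid classifier are all hypothetical stand-ins, not the paper's method, and the intelligent optimization step is omitted entirely.

```python
import math

def max_neighborhood_weights(X, y):
    """Initial sample weights under the ASSUMED max-neighborhood reading.

    Radius of sample i = distance to its nearest differently labeled sample.
    Weight of sample i = fraction of same-label samples strictly inside that radius.
    """
    n = len(X)
    weights = []
    for i in range(n):
        # Distances to every sample carrying a different label.
        other = [math.dist(X[i], X[j]) for j in range(n) if y[j] != y[i]]
        radius = min(other) if other else math.inf
        # Same-label samples (including i itself) that fall inside the ball.
        same = [j for j in range(n) if y[j] == y[i]]
        inside = sum(1 for j in same if math.dist(X[i], X[j]) < radius)
        weights.append(inside / len(same))
    return weights

def weighted_centroids(X, y, w):
    """Stand-in weighted classifier: one weighted mean vector per label."""
    dim = len(X[0])
    centroids = {}
    for label in set(y):
        idx = [i for i in range(len(X)) if y[i] == label]
        total = sum(w[i] for i in idx)
        if total == 0:
            # Degenerate case: fall back to an unweighted mean for this label.
            centroids[label] = [sum(X[i][d] for i in idx) / len(idx) for d in range(dim)]
        else:
            centroids[label] = [sum(w[i] * X[i][d] for i in idx) / total for d in range(dim)]
    return centroids

def predict(centroids, x):
    """Assign x to the label whose weighted centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Hypothetical toy data: two made-up clickstream features per student.
# Label 1 = dropped out, label 0 = stayed. The fourth student sits far from
# the rest of its class, closer to the other class.
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.5, 3.5),
     (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
y = [1, 1, 1, 1, 0, 0, 0]

weights = max_neighborhood_weights(X, y)
centroids = weighted_centroids(X, y, weights)
print(weights)
print(predict(centroids, (1.5, 1.5)), predict(centroids, (4.5, 4.5)))
```

In this toy setup the isolated student receives a smaller weight than its classmates, so it pulls the class centroid less than it would in an unweighted mean. Any classifier that accepts per-sample weights could take the place of the nearest-centroid stand-in.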


Updated: 2021-04-18
