Multiscale Drift Detection Test to Enable Fast Learning in Nonstationary Environments
IEEE Transactions on Cybernetics (IF 11.8), Pub Date: 2020-06-16, DOI: 10.1109/tcyb.2020.2989213
XueSong Wang, Qi Kang, MengChu Zhou, Le Pan, Abdullah Abusorrah

In nonstationary environments, a model can easily be influenced by unseen factors and fail to fit a dynamic data distribution. In a classification scenario, this is known as concept drift. For instance, the shopping preferences of customers may change after they move from one city to another. A shopping website or application should therefore alter its recommendations once its predictions of such user patterns become poorer. In this article, we propose a novel approach called the multiscale drift detection test (MDDT) that efficiently localizes abrupt drift points when feature values fluctuate, meaning that the current model needs immediate adaptation. MDDT is based on a resampling scheme and a paired Student's $t$-test. It applies a detection procedure on two different scales. Initially, detection is performed on a broad scale to check whether recently gathered drift indicators remain stationary. If a drift is claimed, a narrow-scale detection is performed to trace the refined change time. This multiscale structure avoids the heavy time cost of constant checking and filters noise in the drift indicators. Experiments on synthetic and real-world datasets compare the proposed method with several algorithms. The results indicate that it outperforms the others on datasets with abrupt shifts and achieves the highest recall score in localizing drift points.
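
The two-scale procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MDDT implementation: the abstract does not specify the resampling scheme or the narrow-scale test, so the sketch swaps in averages over consecutive blocks for the resampling step and a simple mean-gap scan for the narrow-scale step. The names `block_means`, `paired_t`, and `detect_drift`, and the parameters `window`, `k`, `narrow`, and `t_crit`, are all hypothetical.

```python
import math
from statistics import mean, stdev


def block_means(window, k):
    """Split a window into k equal consecutive blocks and average each one.

    Assumption: stands in for the paper's resampling scheme, which the
    abstract does not describe.
    """
    size = len(window) // k
    return [mean(window[i * size:(i + 1) * size]) for i in range(k)]


def paired_t(a, b):
    """Paired Student's t statistic for two equal-length samples."""
    diffs = [x - y for x, y in zip(a, b)]
    m, sd = mean(diffs), stdev(diffs)
    if sd == 0.0:  # identical differences: nothing to normalize by
        return 0.0 if m == 0.0 else math.copysign(math.inf, m)
    return m / (sd / math.sqrt(len(diffs)))


def detect_drift(indicators, window=40, k=4, narrow=5, t_crit=3.182):
    """Return the estimated change index, or None if no drift is claimed.

    indicators: list of drift-indicator values (e.g. per-sample losses).
    t_crit: two-sided 5% critical value for k - 1 = 3 degrees of freedom;
            it must be changed if k is changed.
    """
    if len(indicators) < 2 * window:
        return None  # not enough data for a reference and a recent window

    # Broad scale: compare the most recent window with the one before it.
    start = len(indicators) - 2 * window
    ref = indicators[start:start + window]
    recent = indicators[start + window:]
    t = paired_t(block_means(ref, k), block_means(recent, k))
    if abs(t) <= t_crit:
        return None  # indicators look stationary; skip the fine search

    # Narrow scale (assumed form): scan candidate split points and keep the
    # one with the largest gap between short windows on either side.
    best_split, best_gap = None, -1.0
    for s in range(start + narrow, len(indicators) - narrow + 1):
        left = mean(indicators[s - narrow:s])
        right = mean(indicators[s:s + narrow])
        gap = abs(left - right)
        if gap > best_gap:
            best_split, best_gap = s, gap
    return best_split


# Synthetic indicator stream with an abrupt upward shift at index 40.
stream = [(1.0 if i >= 40 else 0.0) + (i % 7) * 0.1 for i in range(80)]
print(detect_drift(stream))  # 40 for this synthetic stream
```

The expensive fine-grained scan runs only after the broad-scale test fires, which reflects the abstract's point that the multiscale structure avoids constant checking. Averaging over blocks also smooths noise in individual indicator values before the test is applied.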

Updated: 2020-06-16