Online Active Learning for Drifting Data Streams
IEEE Transactions on Neural Networks and Learning Systems ( IF 10.2 ) Pub Date : 2021-07-21 , DOI: 10.1109/tnnls.2021.3091681
Sanmin Liu 1 , Shan Xue 2 , Jia Wu 2 , Chuan Zhou 3 , Jian Yang 2 , Zhao Li 4 , Jie Cao 5

Classification methods for streaming data are not new, but very few current frameworks address all three of the most common problems with these tasks: concept drift, noise, and the exorbitant costs associated with labeling the unlabeled instances in data streams. Motivated by this gap in the field, we developed an active learning framework based on a dual-query strategy and Ebbinghaus’s law of human memory cognition. Called CogDQS, the query strategy samples only the most representative instances for manual annotation based on local density and uncertainty, thus significantly reducing the cost of labeling. The policy for discerning drift from noise and replacing outdated instances with new concepts is based on the three criteria of the Ebbinghaus forgetting curve: recall, the fading period, and the memory strength. Simulations comparing CogDQS with baselines on six different data streams containing gradual drift or abrupt drift with and without noise show that our approach produces accurate, stable models with good generalization ability at minimal labeling, storage, and computation costs.
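
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' CogDQS implementation. It assumes a margin-based uncertainty score, a radius-based local density over a sliding window, and the commonly cited exponential form of the Ebbinghaus forgetting curve, R = exp(-t / S). All function names, thresholds, and the radius are hypothetical choices made for illustration; the abstract does not specify them.

```python
import math


def retention(elapsed, strength):
    """Exponential forgetting curve R = exp(-t / S) (assumed form).

    elapsed:  time since the instance was last recalled
    strength: memory strength S; larger values decay more slowly
    """
    return math.exp(-elapsed / strength)


def margin_uncertainty(probs):
    """Uncertainty as 1 minus the gap between the two largest class probabilities."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])


def local_density(x, window, radius):
    """Fraction of points in the sliding window lying within `radius` of x."""
    if not window:
        return 0.0
    near = sum(1 for w in window if math.dist(x, w) <= radius)
    return near / len(window)


def should_query(x, probs, window, radius=1.0,
                 density_min=0.5, uncertainty_min=0.6):
    """Dual query (hypothetical thresholds): request a label only when the
    instance is both in a dense region and uncertain for the classifier."""
    dense = local_density(x, window, radius) >= density_min
    uncertain = margin_uncertainty(probs) >= uncertainty_min
    return dense and uncertain
```

In this sketch, an instance whose retention falls below some chosen cutoff would be a candidate for replacement, while an instance that keeps being recalled would have its strength increased. How CogDQS actually combines recall, the fading period, and memory strength is described in the paper itself, not here.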

Updated: 2021-07-21