A hybrid block-based ensemble framework for the multi-class problem to react to different types of drifts
Cluster Computing ( IF 3.6 ) Pub Date : 2021-03-29 , DOI: 10.1007/s10586-021-03267-7
Osama A. Mahdi , Eric Pardede , Nawfal Ali

Data stream mining is an important research topic that has received increasing attention due to its use in a wide range of applications, such as sensor networks, banking, and telecommunication. The phenomenon of data streams evolving over time is known as concept drift. In addition, the presence of multiple classes aggravates the loss in performance during drift detection in data streams. Several drift detectors and ensemble approaches have been widely employed; however, they either incur a high cost in memory consumption and run time or, in the case of ensemble approaches, may respond slowly because they train classifiers on outdated blocks. Motivated by this, we propose a hybrid block-based ensemble, a framework for multi-class classification in evolving data streams. The framework aims to integrate the main advantages of an online drift detector for a k-class problem with the concept of block-based weighting, with a view to reacting to different types of drifts. Experimental evaluations on well-known synthetic and real-world datasets, through a comprehensive comparison against eleven drift detectors and five ensemble approaches, show that our proposed algorithms perform significantly better than the other drift detectors and ensemble approaches.




Updated: 2021-03-30