当前位置: X-MOL 学术WIREs Data Mining Knowl. Discov. › 论文详情
Our official English website, www.x-mol.net, welcomes your feedback! (Note: you will need to create a separate account there.)
No Free Lunch Theorem for concept drift detection in streaming data classification: A review
WIREs Data Mining and Knowledge Discovery ( IF 6.4 ) Pub Date : 2019-07-02 , DOI: 10.1002/widm.1327
Hanqing Hu 1 , Mehmed Kantardzic 1 , Tegjyot S. Sethi 1
Affiliation  

Many real‐world data mining applications have to deal with unlabeled streaming data. They are unlabeled because the sheer volume of the stream makes it impractical to label a significant portion of the data. The data streams can evolve over time and these changes are called concept drifts. Concept drifts have different characteristics, which can be used to categorize them into different types. A trade‐off between performance and cost exists among many concept drift detection approaches. On the one hand, high accuracy detection approach usually requires labeled data, possibly involving high cost for labeling. On the other hand, a variety of methods have been devoted to the topic of concept drift detection with unlabeled data, but these approaches often are most suited for only a subset of the concept drift types. The objective of this survey is to present these methods, categorize them and give recommendations of usage based on their behaviors under different types of concept drift.

中文翻译:

在流数据分类中没有用于概念漂移检测的免费午餐定理:回顾

许多现实世界的数据挖掘应用程序必须处理未标记的流数据。它们之所以没有标签,是因为数据流的绝对容量使其无法标记大部分数据。数据流会随着时间而发展,这些变化称为概念漂移。概念漂移具有不同的特征,可用于将其分为不同类型。许多概念漂移检测方法之间存在性能与成本之间的权衡。一方面,高精度检测方法通常需要标记的数据,可能涉及标记的高成本。另一方面,针对未标记数据的概念漂移检测主题有多种方法,但是这些方法通常最仅适用于概念漂移类型的子集。
更新日期:2019-07-02
down
wechat
bug