Chebyshev approaches for imbalanced data streams regression models
Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2021-09-20, DOI: 10.1007/s10618-021-00793-1
Ehsan Aminian 1 , Rita P. Ribeiro 2 , João Gama 3

In recent years, data stream mining and learning from imbalanced data have both been active research areas. Although solutions exist for each problem, most are not designed to handle the challenges that arise when the two are combined. As far as we are aware, the few approaches to learning from imbalanced data streams address classification, and no efforts in the regression domain have been reported yet. This paper proposes a technique that uses sampling strategies to cope with imbalanced data streams in a regression setting, where the most important cases have rare and extreme target values. Specifically, we employ under-sampling and over-sampling strategies that use the value of Chebyshev's inequality as a heuristic to determine whether an incoming case is frequent or rare. We evaluated our proposal by applying it to the training of models with four well-known regression algorithms over fourteen benchmark data sets, in a series of experiments with different setups on both synthetic and real-world data. The experimental results confirm the effectiveness of our approach: models trained with each of the sampling strategies outperform their baseline counterparts.
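To make the heuristic concrete, here is a minimal sketch of how Chebyshev's inequality, P(|Y - mean| >= t * std) <= 1 / t^2, could drive both sampling strategies on a stream. The abstract does not give the exact rules, so everything below is an assumption made for illustration: the names `RunningStats`, `chebyshev_bound`, `under_sample`, and `over_sample_copies` are hypothetical, as are the choices to keep a case with probability 1 minus the bound and to replicate a case ceil(t) times.

```python
import math
import random


class RunningStats:
    """Welford's online mean and sample variance of the target values seen so far."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, y):
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (y - self.mean)

    @property
    def std(self):
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def deviation(y, stats):
    """Distance of y from the running mean, in units of the running std (t)."""
    return abs(y - stats.mean) / stats.std if stats.std > 0.0 else 0.0


def chebyshev_bound(y, stats):
    """Chebyshev upper bound on the probability of a target at least this extreme."""
    t = deviation(y, stats)
    return 1.0 if t <= 1.0 else 1.0 / (t * t)


def under_sample(y, stats, rng):
    """Assumed rule: keep a case with probability 1 - bound (frequent cases are dropped)."""
    return rng.random() < 1.0 - chebyshev_bound(y, stats)


def over_sample_copies(y, stats):
    """Assumed rule: present a case ceil(t) times, and at least once."""
    return max(1, math.ceil(deviation(y, stats)))


if __name__ == "__main__":
    rng = random.Random(0)
    stats = RunningStats()
    for y in [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]:
        # Decide using the statistics of earlier cases, then update them.
        print(y, under_sample(y, stats, rng), over_sample_copies(y, stats))
        stats.update(y)
```

Under these assumptions, a target near the running mean has a bound of 1 and is never kept by the under-sampler, while an extreme target has a bound close to 0 and is almost always kept. In a real pipeline the retained or replicated cases would be passed to an incremental regressor.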




Updated: 2021-09-21