Two‐level pruning based ensemble with abstained learners for concept drift in data streams
Expert Systems ( IF 3.0 ) Pub Date : 2020-12-29 , DOI: 10.1111/exsy.12661
Kanu Goel, Shalini Batra

Mining data streams for predictive analysis is one of the most interesting topics in machine learning. With drifting data distributions, it becomes important to build adaptive systems that are both dynamic and accurate. Although ensembles are powerful for improving the accuracy of incremental learning, it is crucial to maintain the most suitable set of learners in the ensemble while considering the diversity between them. By adding diversity‐based pruning to traditional accuracy‐based pruning, this paper proposes a novel concept drift handling approach named Two‐Level Pruning based Ensemble with Abstained Learners (TLP‐EnAbLe). In this approach, deferred similarity‐based pruning delays the removal of underperforming, similar learners until it is certain that they are no longer fit for prediction. The proposed scheme thus retains diverse learners that are well suited to the current concept. Two‐level abstaining monitors the performance of learners and chooses the best set of competent learners to participate in decision making. This enhances the traditional majority voting system by dynamically selecting high‐performing learners and abstaining those that are not suitable for prediction. Our experiments demonstrate that TLP‐EnAbLe handles concept drift more effectively than other state‐of‐the‐art algorithms on nineteen artificially drifting and ten real‐world datasets. Further, statistical tests conducted on various drift patterns, including gradual, abrupt, recurring, and their combinations, confirm the efficiency of the proposed approach.
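The two central mechanisms in the abstract — pruning that weighs both accuracy and inter-learner similarity, and voting in which only currently competent learners participate — can be sketched as follows. This is a minimal illustration, not the paper's algorithm: all class names, thresholds, and the agreement measure are hypothetical stand-ins.

```python
# Illustrative sketch of (1) similarity-plus-accuracy pruning and
# (2) abstaining majority voting. Names and thresholds are hypothetical,
# not taken from the TLP-EnAbLe paper.
from collections import deque

class TrackedLearner:
    """Wraps a base model and tracks its accuracy over a sliding window."""
    def __init__(self, model, window=100):
        self.model = model
        self.recent = deque(maxlen=window)  # 1 = correct, 0 = wrong

    def predict(self, x):
        return self.model(x)

    def record(self, x, y_true):
        self.recent.append(1 if self.model(x) == y_true else 0)

    @property
    def accuracy(self):
        return sum(self.recent) / len(self.recent) if self.recent else 0.0

def agreement(a, b, xs):
    """Fraction of samples on which two learners make the same prediction."""
    return sum(a.predict(x) == b.predict(x) for x in xs) / len(xs)

def prune(learners, xs, acc_floor=0.6, sim_ceiling=0.95):
    """Drop the weaker member of any highly similar pair, but only once it is
    also underperforming -- a crude stand-in for deferred similarity pruning,
    which keeps diverse learners and delays removal of redundant weak ones."""
    keep = list(learners)
    for a in list(keep):
        for b in list(keep):
            if a is b or a not in keep or b not in keep:
                continue
            if agreement(a, b, xs) > sim_ceiling:
                weaker = a if a.accuracy < b.accuracy else b
                if weaker.accuracy < acc_floor:
                    keep.remove(weaker)
    return keep

def vote_with_abstaining(learners, x, competence=0.6):
    """Majority vote restricted to learners whose recent accuracy clears a
    competence threshold; the rest abstain for this sample."""
    voters = [l for l in learners if l.accuracy >= competence] or learners
    tally = {}
    for l in voters:
        p = l.predict(x)
        tally[p] = tally.get(p, 0) + 1
    return max(tally, key=tally.get)
```

On a drifting stream one would additionally retrain or replace pruned learners; here the sketch only shows how abstaining narrows the vote to learners that fit the current concept while pruning removes redundant weak ones.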

Updated: 2020-12-29