Adaptive boosting in ensembles for outlier detection: Base learner selection and fusion via local domain competence, ETRI Journal



ETRI Journal (IF 1.4), Pub Date: 2020-03-30, DOI: 10.4218/etrij.2019-0205
Joash Kiprotich Bii, Richard Rimiru, Ronald Waweru Mwangi


Unusual data patterns, or outliers, can arise from human error, incorrect measurements, or malicious activity. Detecting outliers is a difficult task that requires complex ensembles. An ideal outlier detection ensemble should exploit the strengths of its individual base detectors while carefully combining their outputs, producing a strong overall ensemble that achieves unbiased accuracy with minimal variance. Selecting and combining the outputs of dissimilar base learners is challenging. This paper proposes a model that uses heterogeneous base learners. In the first phase, it adaptively boosts the outcomes of preceding learners by assigning weights and identifying high-performing learners based on their local domains; in the second phase, it carefully fuses their outcomes to improve overall accuracy. Ten benchmark datasets are used to train and test the proposed model. To assess how well it separates outliers from inliers, the model is evaluated using accuracy metrics. The analyzed data are presented as crosstabs and percentages, followed by a descriptive method for synthesis and interpretation.
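The two-phase idea in the abstract (weight heterogeneous base learners by how well they perform in a point's local domain, then fuse the competent ones) can be illustrated with a toy sketch. This is not the authors' algorithm: the two 1-D detectors, their thresholds, the small labeled validation set, and the use of the standard AdaBoost weight formula on a k-nearest-neighbor region are all assumptions made here for illustration.

```python
import math

def make_zscore_detector(train, threshold=2.0):
    """Global detector (assumed): flag points far from the overall training mean."""
    mean = sum(train) / len(train)
    std = math.sqrt(sum((t - mean) ** 2 for t in train) / len(train)) or 1.0
    return lambda x: abs(x - mean) / std > threshold

def make_knn_detector(train, k=3, radius=1.5):
    """Local detector (assumed): flag points whose k-th nearest training point is far away."""
    return lambda x: sorted(abs(x - t) for t in train)[k - 1] > radius

def local_weight(detector, x, validation, k=3, eps=1e-6):
    """AdaBoost-style weight computed from the detector's error on the
    k labeled validation points nearest to x (its 'local domain')."""
    region = sorted(validation, key=lambda item: abs(item[0] - x))[:k]
    errors = sum(detector(v) != label for v, label in region)
    err = min(max(errors / len(region), eps), 1.0 - eps)  # avoid log(0)
    return 0.5 * math.log((1.0 - err) / err)

def fused_is_outlier(x, detectors, validation, k=3):
    """Phase 1: weight each detector locally and keep those better than chance.
    Phase 2: fuse the kept detectors with a weighted vote."""
    weighted = [(local_weight(d, x, validation, k), d(x)) for d in detectors]
    competent = [(w, vote) for w, vote in weighted if w > 0]
    if not competent:
        # No locally competent detector: fall back to an unweighted vote.
        competent = [(1.0, vote) for _, vote in weighted]
    score = sum(w if vote else -w for w, vote in competent)
    return score > 0

# Hypothetical data: two tight clusters near 0 and 10; anything else is an outlier.
train = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, 10.0, 10.5, 9.5, 10.2, 9.8, 10.1]
validation = [(0.0, False), (0.15, False), (4.5, True), (5.0, True),
              (5.5, True), (9.7, False), (10.0, False), (20.0, True)]

zscore = make_zscore_detector(train)
knn = make_knn_detector(train)
detectors = [zscore, knn]

for x in (0.05, 5.2, 20.0):
    print(x, fused_is_outlier(x, detectors, validation))
```

In this toy setup the global z-score detector cannot see a point lying between the two clusters, so in that region it receives a negative local weight and is dropped, leaving the decision to the neighbor-based detector. Real outlier detection is often unsupervised, so the labeled validation set used here is a simplifying assumption.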


Updated: 2020-03-30
