RADE: resource-efficient supervised anomaly detection using decision tree-based ensemble methods,Machine Learning

Machine Learning (IF 7.5), Pub Date: 2021-09-03, DOI: 10.1007/s10994-021-06047-x
Shay Vargaftik¹, Isaac Keslassy¹,², Yaniv Ben-Itzhak¹, Ariel Orda²


The capability to perform anomaly detection in a resource-constrained setting, such as an edge device or a loaded server, is increasingly needed due to emerging on-premises computation constraints as well as security, privacy and profitability reasons. Yet the increasing size of datasets often makes current anomaly detection methods, and in particular decision-tree based ensemble classifiers, too resource-consuming. To address this need, we present RADE, a new resource-efficient anomaly detection framework that augments standard decision-tree based ensemble classifiers to perform well in a resource-constrained setting. The key idea behind RADE is first to train a small model that is sufficient to correctly classify the majority of the queries. Then, using only subsets of the training data, RADE trains expert models for the fewer, harder cases where the small model is at high risk of making a classification mistake. We implement RADE as a scikit-learn classifier. Our evaluation indicates that RADE offers competitive anomaly detection capabilities as compared to standard methods while significantly reducing memory footprint by up to \(12\times\), training time by up to \(20\times\), and classification time by up to \(16\times\).
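The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `ConfidenceCascade` and `CentroidModel`, the confidence threshold, and the rule "a query is hard when the small model's top class probability falls below the threshold" are all assumptions made for illustration. The toy `CentroidModel` only stands in for a tree ensemble so that the example runs with the standard library alone.

```python
import math


class CentroidModel:
    """Toy stand-in for a tree ensemble (hypothetical, for illustration only).

    Exposes the scikit-learn-style methods the cascade relies on:
    fit, predict_proba, predict, and the classes_ attribute.
    """

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.centroids_ = []
        for c in self.classes_:
            rows = [x for x, label in zip(X, y) if label == c]
            self.centroids_.append([sum(col) / len(rows) for col in zip(*rows)])
        return self

    def predict_proba(self, X):
        out = []
        for x in X:
            # Closer centroids get exponentially larger weight.
            w = [math.exp(-math.dist(x, c)) for c in self.centroids_]
            total = sum(w)
            out.append([v / total for v in w])
        return out

    def predict(self, X):
        return [
            self.classes_[max(range(len(p)), key=p.__getitem__)]
            for p in self.predict_proba(X)
        ]


class ConfidenceCascade:
    """Small model first; an expert handles low-confidence queries.

    The threshold and the routing rule are assumptions of this sketch.
    """

    def __init__(self, make_small, make_expert, threshold=0.9):
        self.make_small = make_small
        self.make_expert = make_expert
        self.threshold = threshold

    def _hard_indices(self, X):
        # A sample is "hard" when the small model's top probability is low.
        proba = self.small_.predict_proba(X)
        return [i for i, p in enumerate(proba) if max(p) < self.threshold]

    def fit(self, X, y):
        self.small_ = self.make_small().fit(X, y)
        hard = self._hard_indices(X)
        self.n_hard_ = len(hard)
        self.expert_ = None
        # Train the expert on the hard subset only, and only if that
        # subset contains more than one class.
        if len({y[i] for i in hard}) > 1:
            self.expert_ = self.make_expert().fit(
                [X[i] for i in hard], [y[i] for i in hard]
            )
        return self

    def predict(self, X):
        labels = list(self.small_.predict(X))
        hard = self._hard_indices(X)
        if self.expert_ is not None and hard:
            # Overwrite only the low-confidence answers with the expert's.
            expert_labels = self.expert_.predict([X[i] for i in hard])
            for i, label in zip(hard, expert_labels):
                labels[i] = label
        return labels


# Tiny 1-D example: label 0 = normal, label 1 = anomaly.
X = [[0.0], [1.0], [2.0], [4.5], [5.5], [8.0], [9.0], [10.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
queries = [[1.0], [4.9], [5.2], [9.5]]

cascade = ConfidenceCascade(CentroidModel, CentroidModel, threshold=0.9).fit(X, y)
preds = cascade.predict(queries)
print(cascade.n_hard_, preds)
```

Because the cascade only calls `fit`, `predict_proba`, and `predict`, any estimator with that interface could be passed as `make_small` or `make_expert`, for example a small and a larger scikit-learn `RandomForestClassifier`. In this sketch the expert sees only the training samples the small model was unsure about, which is one plausible reading of "using only subsets of the training data".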


Updated: 2021-09-03
