

Detecting virtual concept drift of regressors without ground truth values
Data Mining and Knowledge Discovery (IF 2.8), Pub Date: 2021-02-04, DOI: 10.1007/s10618-021-00739-7
Emilia Oikarinen, Henri Tiittanen, Andreas Henelius, Kai Puolamäki

Regression analysis is a standard supervised machine learning method used to model an outcome variable in terms of a set of predictor variables. In most real-world applications, the true value of the outcome variable we want to predict is unknown outside the training data, i.e., the ground truth is unknown. Phenomena such as overfitting and concept drift make it difficult to directly observe when a model's estimate is potentially wrong. In this paper, we present an efficient framework for estimating the generalization error of regression functions when the ground truth is unknown; the framework is applicable to any family of regression functions. We present a theoretical derivation of the framework and empirically evaluate its strengths and limitations. We find that it performs robustly and is useful for detecting concept drift in datasets from several real-world domains.
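
The abstract does not spell out how the framework works, so the following is only an illustrative sketch of one generic label-free heuristic, not the authors' method. It assumes that disagreement between a regressor trained on all training data and regressors trained on contiguous segments of it can serve as a proxy for error on new, unlabeled inputs. The function names (`fit_linear`, `disagreement_score`), the linear model, the number of segments, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_linear(x, y):
    """Fit a least-squares line and return (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


def predict(model, x):
    """Evaluate a (slope, intercept) model at the inputs x."""
    slope, intercept = model
    return slope * x + intercept


def disagreement_score(x_train, y_train, x_new, n_segments=3):
    """Mean RMSE between the full model and segment models on x_new.

    No labels for x_new are needed: the score only compares predictions.
    """
    full_pred = predict(fit_linear(x_train, y_train), x_new)
    rmses = []
    # Split the training data into contiguous segments (hypothetical choice).
    for idx in np.array_split(np.arange(len(x_train)), n_segments):
        seg_pred = predict(fit_linear(x_train[idx], y_train[idx]), x_new)
        rmses.append(np.sqrt(np.mean((seg_pred - full_pred) ** 2)))
    return float(np.mean(rmses))


# Synthetic example: a nonlinear target approximated by a linear regressor.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 300)
y_train = np.sin(2.0 * x_train) + rng.normal(0.0, 0.1, 300)

x_same = rng.uniform(-1.0, 1.0, 200)    # same input distribution as training
x_shifted = rng.uniform(3.0, 5.0, 200)  # shifted inputs (virtual drift)

score_same = disagreement_score(x_train, y_train, x_same)
score_shifted = disagreement_score(x_train, y_train, x_shifted)
print("disagreement on in-distribution inputs:", score_same)
print("disagreement on shifted inputs:", score_shifted)
```

In this toy setup the segment models agree closely inside the training range and diverge when extrapolating, so the score is expected to be larger for the shifted inputs. In practice, a threshold on such a score would have to be calibrated on held-out data.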


Updated: 2021-02-05
