Binary Classifier Calibration Using an Ensemble of Piecewise Linear Regression Models


Knowledge and Information Systems (IF 2.5), Pub Date: 2017-11-17, DOI: 10.1007/s10115-017-1133-2
Mahdi Pakdaman Naeini (1, 2), Gregory F Cooper (3)

In this paper, we present a new nonparametric calibration method called ensemble of near-isotonic regression (ENIR). The method can be viewed as an extension of BBQ (Naeini et al., in: Proceedings of twenty-ninth AAAI conference on artificial intelligence, 2015b), a recently proposed calibration method, and of the commonly used calibration method based on isotonic regression (IsoRegC) (Zadrozny and Elkan, in: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining 2002). ENIR is designed to address the key limitation of IsoRegC, namely its monotonicity assumption on the predictions. Like BBQ, the method post-processes the output of a binary classifier to obtain calibrated probabilities, so it can be used with many existing classification models to generate accurate probabilistic predictions. We demonstrate the performance of ENIR on synthetic and real datasets for commonly applied binary classification models. Experimental results show that the method outperforms several common binary classifier calibration methods. In particular, on the real data we evaluated, ENIR commonly performs statistically significantly better than the other methods, and never worse. It improves the calibration power of classifiers while retaining their discrimination power. The method is also computationally tractable for large-scale datasets, as it runs in \(O(N \log N)\) time, where \(N\) is the number of samples.
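To make the post-processing idea concrete, the following is a minimal sketch of the isotonic-regression baseline (IsoRegC-style calibration) that the abstract says ENIR extends. It is not the ENIR algorithm itself, whose details the abstract does not give. It uses the standard pool-adjacent-violators procedure, and the function names `fit_isotonic_calibrator` and `calibrate` are hypothetical, chosen only for illustration. The fitted map is forced to be non-decreasing in the classifier score, which is exactly the monotonicity assumption ENIR is designed to relax.

```python
from bisect import bisect_left


def fit_isotonic_calibrator(scores, labels):
    """Fit a non-decreasing step function from classifier scores to
    probabilities using pool-adjacent-violators (illustrative sketch)."""
    pairs = sorted(zip(scores, labels))  # the sort costs O(N log N)
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for score, label in pairs:
        if blocks and blocks[-1][2] == score:
            # Tied scores must share one calibrated value.
            blocks[-1][0] += label
            blocks[-1][1] += 1
        else:
            blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    thresholds = [b[2] for b in blocks]   # upper score edge of each block
    values = [b[0] / b[1] for b in blocks]  # calibrated probability per block
    return thresholds, values


def calibrate(score, thresholds, values):
    """Map a raw classifier score to its calibrated probability."""
    index = bisect_left(thresholds, score)
    return values[min(index, len(values) - 1)]


# Toy data (made up for illustration): raw scores and true binary labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
thresholds, values = fit_isotonic_calibrator(scores, labels)
print(calibrate(0.35, thresholds, values))
```

Because the calibrator only consumes scores and labels, it can sit behind any binary classifier. In this sketch the ranking of scores is never inverted by the non-decreasing map, although distinct scores that fall in the same block receive the same calibrated value.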


Updated: 2017-11-17
