On the appropriateness of Platt scaling in classifier calibration
Information Systems (IF 3.0), Pub Date: 2020-09-10, DOI: 10.1016/j.is.2020.101641
Björn Böken

Many applications of data mining and machine learning require posterior probability estimates in addition to highly accurate predictions. Classifier calibration is a separate branch of machine learning that aims to transform classifier predictions into posterior class probabilities; calibration techniques are therefore useful extensions in these applications. Among the existing state-of-the-art classifier calibration techniques, Platt scaling (sometimes also called sigmoid or logistic calibration) is the only parametric one, while almost all competing methods do not rely on parametric assumptions. Platt scaling is controversial in the classifier calibration literature: despite good empirical results reported in many domains, many authors criticize it. Interestingly, none of these criticisms properly addresses the underlying parametric assumptions, and some statements in the literature are even incorrect. Thus, the first contribution of this work is to review this criticism and to present a proof of the true parametric assumptions. An immediate consequence is that these assumptions are more general and hold for different probability distributions. Next, the relationship between Platt scaling and a different, relatively new classifier calibration technique called beta calibration is analyzed, and it is shown that the two are actually equivalent: their only difference lies in the characteristics of the classifier whose predictions are calibrated. Thus, the proven validity of Platt scaling translates directly into a proven optimality of beta calibration. Furthermore, evaluating classifier calibration techniques is a highly non-trivial problem because the true posteriors cannot be used as a reference. Hence, the existing evaluation metrics are reviewed as well, since some relatively popular evaluation criteria should not be used at all. Finally, the theoretical findings are supported by a simulation study.
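
For readers unfamiliar with the technique, the following is a minimal sketch of what Platt scaling typically looks like in practice; it is not taken from the paper. It assumes the common parameterization p(y=1 | s) = sigmoid(a*s + b) for a real-valued classifier score s, and it fits a and b by plain gradient descent on the log loss, chosen here only for simplicity. The function names `sigmoid`, `fit_platt`, and `platt_predict`, the learning rate, and the iteration count are all hypothetical choices for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, n_iter=5000):
    """Fit p(y=1 | s) = sigmoid(a*s + b) by gradient descent on the mean log loss.

    scores: real-valued classifier outputs (assumed to be uncalibrated).
    labels: 0/1 class labels for the same examples.
    """
    a, b = 1.0, 0.0  # arbitrary starting point (an assumption of this sketch)
    n = len(scores)
    for _ in range(n_iter):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def platt_predict(score, a, b):
    # Map a raw score to a calibrated probability estimate.
    return sigmoid(a * score + b)
```

In use, `fit_platt` would be run on a held-out calibration set and `platt_predict` applied to new scores. Because only two parameters are fitted, the resulting map is monotone in the score, which is the parametric restriction the abstract refers to.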


Updated: 2020-09-10