On Model Evaluation under Non-constant Class Imbalance,arXiv - CS - Machine Learning


arXiv - CS - Machine Learning. Pub Date: 2020-01-15, DOI: arxiv-2001.05571
Jan Brabec, Tomáš Komárek, Vojtěch Franc, Lukáš Machlica

Many real-world classification problems are significantly class-imbalanced, to the detriment of the class of interest. The standard set of proper evaluation metrics is well known, but the usual assumption is that the class imbalance of the test dataset equals the real-world imbalance. In practice, this assumption is often broken for various reasons. The reported results are then often too optimistic and may lead to wrong conclusions about the industrial impact and suitability of the proposed techniques. We introduce methods focusing on evaluation under non-constant class imbalance. We show that a change in the imbalance rate affects not only the absolute values of commonly used metrics, but even the ranking of classifiers under the chosen evaluation metric. Finally, we demonstrate that subsampling the test dataset to match the class imbalance observed in the wild is not necessary and can even lead to significant errors in the estimate of the classifier's performance.
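The dependence of metrics on the imbalance rate can be illustrated with a minimal sketch. It is not taken from the paper; it only uses the standard identity that precision at a fixed operating point is TPR·π / (TPR·π + FPR·(1 − π)), where π is the positive-class rate. The function names and the two classifiers "A" and "B" with their TPR/FPR values are hypothetical, chosen only to show that a ranking by F1 can flip when π changes.

```python
# Illustrative sketch (assumption: each classifier is summarized by a fixed
# operating point, i.e. a true positive rate and a false positive rate that
# do not depend on the class prior).

def precision_at_prior(tpr: float, fpr: float, pos_rate: float) -> float:
    """Precision implied by TPR, FPR and the positive-class rate pos_rate."""
    tp = tpr * pos_rate            # expected fraction of true positives
    fp = fpr * (1.0 - pos_rate)    # expected fraction of false positives
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def f1_at_prior(tpr: float, fpr: float, pos_rate: float) -> float:
    """F1 score at the given positive-class rate (recall equals TPR)."""
    p = precision_at_prior(tpr, fpr, pos_rate)
    return 2.0 * p * tpr / (p + tpr) if (p + tpr) > 0 else 0.0


# Hypothetical operating points: (TPR, FPR)
classifiers = {"A": (0.90, 0.10), "B": (0.60, 0.01)}

for pos_rate in (0.5, 0.01):  # balanced test set vs. a rare positive class
    for name, (tpr, fpr) in classifiers.items():
        p = precision_at_prior(tpr, fpr, pos_rate)
        f1 = f1_at_prior(tpr, fpr, pos_rate)
        print(f"pos_rate={pos_rate:<5} {name}: precision={p:.3f} F1={f1:.3f}")
```

With these made-up numbers, classifier A has the higher F1 on the balanced set, while classifier B has the higher F1 once positives make up 1% of the data, so both the absolute values and the ordering depend on the assumed imbalance. Because the adjustment is analytic, no subsampling of the test set is involved in this sketch.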


Updated: 2020-04-16
