Explaining the unsuitability of the kappa coefficient in the assessment and comparison of the accuracy of thematic maps obtained by image classification
Remote Sensing of Environment (IF 11.1), Pub Date: 2020-03-01, DOI: 10.1016/j.rse.2019.111630
Giles M. Foody

The kappa coefficient is not an index of accuracy; indeed, it is not an index of overall agreement but of agreement beyond chance. Chance agreement is, however, irrelevant in an accuracy assessment and is in any case inappropriately modelled in the calculation of a kappa coefficient for typical remote sensing applications. The magnitude of a kappa coefficient is also difficult to interpret. Classifications that satisfy demanding accuracy targets can yield kappa values spanning the full range of widely used interpretation scales, from a level of agreement estimated to arise from chance alone all the way through to almost perfect agreement (e.g. for a classification with overall accuracy of 95%, the range of possible values of the kappa coefficient is −0.026 to 0.900). Comparisons of kappa coefficients are particularly challenging if the classes vary in their abundance (i.e. prevalence), as the magnitude of a kappa coefficient reflects not only agreement in labelling but also properties of the populations under study. It is shown that all of the arguments put forward for the use of the kappa coefficient in accuracy assessment are flawed and/or irrelevant, as they apply equally to other, sometimes easier to calculate, measures of accuracy. Calls for the kappa coefficient to be abandoned in accuracy assessments should finally be heeded, and researchers are encouraged to provide a set of simple measures and associated outputs, such as estimates of per-class accuracy and the confusion matrix, when assessing and comparing classification accuracy.
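The abstract's numerical claim can be illustrated with a short sketch. The two confusion matrices below are hypothetical examples (not taken from the paper's data): both yield 95% overall accuracy, yet their kappa coefficients sit at roughly the two extremes quoted above (≈0.900 for balanced class prevalence, ≈−0.026 when one class is rare), showing how prevalence, not labelling quality, drives the difference. Per-class (user's and producer's) accuracy, as recommended in the abstract, is also computed.

```python
# Sketch: overall accuracy, Cohen's kappa, and per-class accuracy from a
# confusion matrix (rows = map/predicted labels, columns = reference labels).
# The example matrices are assumptions for illustration only.

def overall_accuracy(m):
    """Proportion of correctly labelled cases: trace / total."""
    n = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / n

def cohens_kappa(m):
    """kappa = (p_o - p_e) / (1 - p_e), where chance agreement p_e is the
    sum over classes of the products of row and column marginal proportions."""
    k = len(m)
    n = sum(sum(row) for row in m)
    p_o = sum(m[i][i] for i in range(k)) / n
    row = [sum(m[i]) for i in range(k)]
    col = [sum(m[j][i] for j in range(k)) for i in range(k)]
    p_e = sum(row[i] * col[i] for i in range(k)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

def per_class_accuracy(m):
    """User's accuracy (row-wise) and producer's accuracy (column-wise)."""
    k = len(m)
    users = [m[i][i] / sum(m[i]) for i in range(k)]
    producers = [m[i][i] / sum(m[j][i] for j in range(k)) for i in range(k)]
    return users, producers

balanced   = [[95, 5], [5, 95]]   # two equally prevalent classes
imbalanced = [[190, 5], [5, 0]]   # same 95% accuracy, one class very rare

print(overall_accuracy(balanced), cohens_kappa(balanced))      # 0.95, 0.900
print(overall_accuracy(imbalanced), cohens_kappa(imbalanced))  # 0.95, ≈ -0.026
```

Note how the imbalanced matrix reproduces the abstract's lower bound: with 95% of cases in one class, expected chance agreement p_e exceeds 0.95, so even 95% observed agreement produces a slightly negative kappa.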

Updated: 2020-03-01