A novel adaptive k-NN classifier for handling imbalance: Application to brain MRI
Intelligent Data Analysis (IF 1.7), Pub Date: 2020-07-15, DOI: 10.3233/ida-194647. Ritaban Kirtania, Sushmita Mitra, B. Uma Shankar
Efficiently classifying imbalanced data has become one of the most challenging tasks in machine learning. Real-world examples include medical image analysis, fraud detection, fault diagnosis, and anomaly detection. Although several data-level algorithms have been developed to address imbalance, they are typically subject to some restrictions. We propose a novel variant of the k-NN family of classifiers, which we call Density-based Adaptive-distance kNN (DAkNN). It can effectively handle data with skewed distributions and varying class densities using the concept of adaptive distance. Its superiority over related data-level algorithms (SMOTE, ADASYN) is established experimentally on ten two-class data sets, in terms of the geometric mean (of the true positive and true negative rates) and accuracy. Additionally, DAkNN is compared with several currently popular k-NN variants on five multi-class data sets. Finally, DAkNN is applied to the highly imbalanced Lower Grade Glioma (LGG) MR images, achieving an Average-Dice score of 0.9082 for delineating the tumor regions. The results demonstrate clear superiority over state-of-the-art algorithms.
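The evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `g_mean`, `accuracy`, and `dice` are hypothetical, labels are assumed to be binary (1 for the minority/positive class), and the Dice function scores a single pair of flattened masks, so how the paper averages it across images is not reproduced here.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the true positive rate and true negative rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)


def dice(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Convention assumed here: two empty masks overlap perfectly.
    return 2 * intersection / total if total else 1.0


# Toy imbalanced data: 2 positives, 8 negatives, classifier predicts all negative.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
print(accuracy(y_true, y_pred))  # high despite missing every positive
print(g_mean(y_true, y_pred))    # zero, because the true positive rate is zero
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

On the toy data, a classifier that ignores the minority class still scores well on accuracy but gets a geometric mean of zero, which is one reason the geometric mean is reported alongside accuracy for imbalanced problems.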
Updated: 2020-07-22