Performance Analysis of Predictive Association Rule Classifiers using Healthcare Datasets, IETE Technical Review


IETE Technical Review (IF 2.5), Pub Date: 2020-10-14, DOI: 10.1080/02564602.2020.1827988
M. Nandhini¹,², M. Rajalakshmi³, S. N. Sivanandam⁴


In recent years, data mining techniques have gained significance in healthcare applications. Appropriate data mining techniques extract interesting affinities/associations between patients’ signs and symptoms, thus supporting reasonable decision-making for the diagnosis and prognosis of disease. Associative Classification (AC) is a contemporary technique that uses Class Association Rules (CARs) to build a classification system. Classification based on Predictive Association Rule (CPAR) is one of the popular AC algorithms; it uses FOIL’s Information Gain measure to select the best attributes for generating CARs. This work explores which attribute selection measure and which error estimate measure best fit into the existing CPAR algorithm to construct an efficient rule-based classifier. To that end, the performance of CPAR is analyzed with two alternative attribute selection measures, Gain Ratio (GR) and Mutual Information Gain (IG), used in place of FOIL’s Information Gain. Moreover, two error estimate measures, Laplace accuracy (“La”) and the Likelihood ratio statistic (“Lr”), are used for the rule evaluation and best k-rule selection tasks of CPAR. The resulting variants, CPAR-GR and CPAR-IG, are compared with the existing CPAR algorithm in terms of classifier accuracy. The results show that using “GR” and “IG” within CPAR yields higher accuracy than the existing CPAR. Significant differences in the performance of CPAR, CPAR-GR, and CPAR-IG are identified and demonstrated through experiments and statistical tests on healthcare datasets taken from the UCI Machine Learning Repository.
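The measures named in the abstract can be illustrated with a minimal Python sketch. It uses the common textbook forms of FOIL gain, information gain, gain ratio, Laplace accuracy, and the likelihood ratio statistic; the abstract gives no formulas, so these definitions, the base-2 logarithm, all function names, and the toy symptom data are assumptions for illustration and not the authors' implementation.

```python
import math
from collections import Counter


def foil_gain(p0, n0, p1, n1):
    """FOIL gain of adding one literal to a rule (textbook form, log base 2 assumed).

    p0, n0: positive/negative examples covered before adding the literal.
    p1, n1: positive/negative examples covered after adding it.
    """
    if p0 == 0 or p1 == 0:
        return 0.0
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def information_gain(values, labels):
    """Reduction in class entropy after splitting on an attribute's values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalised by the split information of the attribute."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0
    return information_gain(values, labels) / split_info


def laplace_accuracy(n_correct, n_covered, n_classes):
    """Laplace expected accuracy of a rule: (n_c + 1) / (n_tot + k)."""
    return (n_correct + 1) / (n_covered + n_classes)


def likelihood_ratio(observed, class_priors):
    """Likelihood ratio statistic 2 * sum(f_i * ln(f_i / e_i)).

    observed: per-class counts among examples covered by the rule.
    class_priors: per-class prior probabilities in the training data.
    """
    n = sum(observed)
    return 2 * sum(
        f * math.log(f / (n * prior))
        for f, prior in zip(observed, class_priors)
        if f > 0
    )


# Hypothetical toy data: one symptom attribute and a binary diagnosis.
symptom = ["high", "high", "low", "low"]
diagnosis = ["sick", "sick", "well", "well"]

print(information_gain(symptom, diagnosis))
print(gain_ratio(symptom, diagnosis))
print(foil_gain(p0=2, n0=2, p1=2, n1=0))
print(laplace_accuracy(n_correct=2, n_covered=2, n_classes=2))
print(likelihood_ratio([2, 0], [0.5, 0.5]))
```

In this sketch, the attribute selection measures (FOIL gain, information gain, gain ratio) score a candidate literal or attribute while a rule is being grown. The error estimate measures (Laplace accuracy, likelihood ratio) score a finished rule, which is the role the abstract describes for rule evaluation and best k-rule selection.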


Updated: 2020-10-14
