Performance Analysis of Machine Learning Algorithms for Thyroid Disease
Arabian Journal for Science and Engineering (IF 2.6), Pub Date: 2021-01-23, DOI: 10.1007/s13369-020-05206-x
Hafiz Abbad Ur Rehman, Chyi-Yeu Lin, Zohaib Mushtaq, Shun-Feng Su

Thyroid disease arises from anomalous growth of thyroid tissue at the edge of the thyroid gland. Thyroid disorders typically occur when the gland releases abnormal amounts of hormones; the two main types are hypothyroidism (an underactive thyroid gland) and hyperthyroidism (an overactive thyroid gland). This study uses machine learning classifiers to detect and diagnose thyroid disease and compares them on accuracy and other performance metrics. It presents an extensive analysis of five classifiers, K-nearest neighbor (KNN), Naïve Bayes, support vector machine, decision tree and logistic regression, each implemented with and without feature selection. The thyroid data were taken from DHQ Teaching Hospital, Dera Ghazi Khan, Pakistan. The dataset differs from those used in existing studies because it includes three additional features: pulse rate, body mass index and blood pressure. The experiment was run in three iterations: the first used no feature selection, while the second and third used L1-based and L2-based feature selection, respectively. The evaluation covered several measures, including accuracy, precision and the receiver operating characteristic (ROC) curve with area under the curve (AUC). The results indicate that classifiers using L1-based feature selection achieved higher overall accuracy (Naïve Bayes 100%, logistic regression 100% and KNN 97.84%) than those using no feature selection or L2-based feature selection.
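The evaluation measures named in the abstract (accuracy, precision and area under the ROC curve) can be illustrated with a small standard-library Python sketch. This is not the authors' code: the function names are hypothetical, the labels are assumed to be binary (1 = disease, 0 = healthy) for simplicity, and the toy values at the bottom are invented for illustration, not taken from the study's dataset.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """TP / (TP + FP) for the chosen positive class; 0.0 if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (invented values, assumption only): classifier scores
# thresholded at 0.5 to obtain hard predictions.
toy_true = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]

print("accuracy :", accuracy(toy_true, toy_pred))
print("precision:", precision(toy_true, toy_pred))
print("ROC AUC  :", roc_auc(toy_true, toy_scores))
```

Accuracy and precision are computed from hard predictions, while AUC is computed from the underlying scores, so a classifier can rank cases well (high AUC) and still lose accuracy at a poorly chosen threshold. If the study's labels have more than two classes, per-class or averaged variants of these measures would be needed.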




Updated: 2021-01-24