A novel fitness function in genetic programming for medical data classification
Journal of Biomedical Informatics (IF 4.5), Pub Date: 2020-11-14, DOI: 10.1016/j.jbi.2020.103623
Arvind Kumar, Nishant Sinha, Arpit Bhardwaj

In the last decade, machine learning (ML) techniques have been widely applied to identify different diseases, which facilitates early diagnosis and increases the chance of survival. Most medical datasets are unbalanced, so ML classification techniques tend to be biased toward the majority class. This paper proposes a novel fitness function in Genetic Programming (GP) for medical data classification that handles the problem of unbalanced data. Four benchmark medical datasets, chronic kidney disease (CKD), fertility, BUPA liver disorder, and Wisconsin diagnostic breast cancer (WDBC), were taken from the University of California, Irvine (UCI) machine learning repository and classified using the proposed technique. The proposed technique achieved a best accuracy of 100%, 99.12%, 85.0%, and 75.36% on the CKD, WDBC, fertility, and BUPA datasets, respectively, and a best AUC of 1.0, 0.99, 0.92, and 0.75, respectively. These results improve on other GP and support vector machine (SVM) methods, supporting the efficiency of the proposed algorithm.
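The abstract does not give the formula of the proposed fitness function, so the following is only a minimal Python sketch of the problem it targets. It assumes a hypothetical sample of 90 majority-class and 10 minority-class cases, and it uses balanced accuracy (mean per-class recall) as a stand-in for a class-aware fitness measure. It is not the authors' function.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall, so each class contributes equally.

    Stand-in for a class-aware fitness; NOT the function from the paper.
    """
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(y_pred[i] == cls for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical unbalanced sample: 90 negatives (0) and 10 positives (1).
y_true = [0] * 90 + [1] * 10
# A degenerate classifier that always predicts the majority class.
majority_pred = [0] * 100

plain = accuracy(y_true, majority_pred)
balanced = balanced_accuracy(y_true, majority_pred)
print(f"accuracy={plain:.2f}, balanced accuracy={balanced:.2f}")
```

Under these assumptions the always-majority classifier scores well on plain accuracy while detecting no minority-class case at all. A fitness measure that weights classes equally exposes the failure, and that is the kind of bias a fitness function for unbalanced data has to penalize.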




Updated: 2020-11-18