Effective multiple cancer disease diagnosis frameworks for improved healthcare using machine learning
Introduction
Disease prediction systems perform a critical function: they determine the presence or absence of a medical condition in an individual. The task involves many factors, varying characteristics, and multifaceted real-world aspects [1], [2]. In recent times, there has been increasing demand for accurate, data-driven predictive models that improve the identification of future events [3]. Several medical associations and patient counseling programs provide cancer screening recommendations and guidance. Patients are advised to review these recommendations with a doctor and to decide together which screening is appropriate given their cancer risk factors. Laboratory tests, such as urine and blood tests, help the doctor detect abnormalities caused by cancer. For example, in people with leukemia, a common blood test called the complete blood count may reveal an unusual number or type of white blood cells. During a biopsy, the doctor collects a sample of cells for examination in the laboratory. The sample can be obtained in several ways, and the appropriate biopsy technique depends on the type of cancer and its location. In most cases, a biopsy is the way to diagnose cancer with certainty.
Such work concerns developing systems that give end users a more interactive and user-friendly environment. In medical practice, the physician or medical expert analyzes an individual's clinical records and diagnoses the condition from experience or domain knowledge [4], [5], [6]. Across the globe, many healthcare providers are adopting computer-assisted diagnosis systems to help medical practitioners reach an accurate diagnosis [7], [8]. Applications in the medical field need special attention when decision support systems are developed. Clinical data contains hidden information that is usually beyond human ability to discern [9]. Finding such patterns is difficult, which has raised demand for new computational methodologies. Moreover, data extracted from a real-time environment is highly prone to noise and erroneous information [10], [11]. The existing mechanisms do not fully meet these challenges, so an effective solution is needed to build better diagnostic systems. This paper examines new techniques to address the gaps and limitations of the existing methods. In general, the outcome of a predictive model strongly depends on its input parameters [12]. Most of the time, the features are more chaotic than simple factors. It is not feasible to use all the features to build the model, since some of them may carry noise or incorrect inputs. The predictive model's performance depends on the significant features identified for effective sample categorization [13]. A small change in the parameters can affect the results on different scales.
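As an illustration of why a feature subset can outperform the full feature set, the following is a minimal sketch of one common filter criterion, the correlation-based feature selection (CFS) merit, k·r_cf / sqrt(k + k(k-1)·r_ff), where r_cf is the mean absolute feature-class correlation and r_ff is the mean absolute feature-feature correlation. The greedy forward search, the function names, and the toy data are assumptions made for this example only; it is not presented as the algorithm proposed in this paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(columns, labels, subset):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute correlation between each chosen feature and the class label.
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    # Mean absolute correlation between every pair of chosen features.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Forward selection: add the feature that raises the merit most; stop when none does."""
    selected, best = [], 0.0
    while True:
        candidates = [j for j in range(len(columns)) if j not in selected]
        scored = [(cfs_merit(columns, labels, selected + [j]), j) for j in candidates]
        if not scored:
            return selected, best
        merit, j = max(scored)
        if merit <= best:
            return selected, best
        selected.append(j)
        best = merit


# Hypothetical toy data: features 0 and 2 track the label, feature 1 is pure noise.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [1, 2, 1, 2, 5, 6, 5, 6],  # informative
    [1, 2, 1, 2, 1, 2, 1, 2],  # uncorrelated with the label
    [3, 3, 4, 4, 7, 7, 8, 8],  # informative
]

selected, merit = greedy_cfs(columns, labels)
print("selected features:", sorted(selected), "merit:", round(merit, 4))
```

On this toy data the search keeps the two informative columns and rejects the noise column, because adding it lowers the merit. A genetic or other global search could replace the greedy loop while reusing the same merit function.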
In many cases, the data comes from a real-time environment, where the chance of inconsistency is high and the quality is often poor [14], [15], [16]. Hence, this paper investigates the existing models to find a better mechanism for improving performance. The objective is to find, for each dataset used in the experiment, the feature subset that supports effective disease diagnosis. Supervised machine learning algorithms were employed to test and evaluate the system's efficacy. The healthcare industry has long been an early adopter of technological innovations and has benefited greatly from them. Machine learning currently plays a key role in several health-related areas, including innovative medical techniques, the processing of patient data and records, and the management of chronic diseases. Today, machine learning helps streamline administration in hospitals, map and manage infectious conditions, and customize patient care. It may improve the productivity of hospitals and health systems and decrease care costs.
This manuscript is organized as follows. The “Background study” section discusses algorithms and frameworks developed as tools for disease diagnosis in previous literature. The “Materials and methods” section describes the proposed methodology in several sub-modules, including the dataset information, the working process of the proposed feature selection method, and the machine learning methods, with illustrative sketches. The model validation and performance evaluation process is detailed in the “Results” section. Finally, the “Conclusion” section presents the findings and their significance with reports and graphical analysis. A variety of screening techniques are recommended to detect the disease in its various phases. Clinicians examine electronic medical records carefully to diagnose and manage the patient. In certain situations, a manual mistake or a misinterpretation of the data may cause a diagnostic error. To prevent these problems, this paper provides an effective computer-aided diagnostic method built on intelligent learning models. A computer-based functional model is proposed to boost predictive efficiency. The experiments use breast, cervical and lung cancer data from the University of California, Irvine (UCI) repository. Supervised learning algorithms are trained and evaluated on the optimal features selected by the proposed method.
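The evaluation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the setup reported in this paper: mean imputation, the nearest-centroid classifier, the 5-fold split, and the toy records are all hypothetical choices made to keep the example self-contained. Imputation statistics are computed on the training folds only, so that the held-out samples do not influence the fitted model.

```python
def column_means(rows):
    """Mean of the observed (non-None) values in each column."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return means


def impute(rows, means):
    """Replace each missing value (None) with the given column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def fit_centroids(rows, labels):
    """Nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def cross_validate(rows, labels, k):
    """k-fold cross-validation accuracy; fold f holds out samples with index % k == f."""
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(len(rows)) if i % k == fold]
        train_idx = [i for i in range(len(rows)) if i % k != fold]
        train_rows = [rows[i] for i in train_idx]
        means = column_means(train_rows)  # fitted on training data only
        centroids = fit_centroids(impute(train_rows, means),
                                  [labels[i] for i in train_idx])
        for i in test_idx:
            pred = predict(centroids, impute([rows[i]], means)[0])
            correct += int(pred == labels[i])
    return correct / len(rows)


# Hypothetical toy records with two features; None marks a missing value.
rows = [
    [1.0, 2.0], [1.2, None], [0.8, 1.9], [None, 2.2], [1.1, 2.1],
    [5.0, 6.0], [5.2, 6.1], [None, 5.8], [4.9, None], [5.1, 6.2],
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

accuracy = cross_validate(rows, labels, k=5)
print("5-fold accuracy:", accuracy)
```

In practice the classifier would be replaced by the supervised learners under study, and the rows would be restricted to the selected feature subset before cross-validation.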
Section snippets
Background study
In recent times, predictive models have shown their importance in many fields, including but not limited to healthcare, weather modeling, stock forecasting, intelligence, and self-trajectory targeted missiles. Many applications have been built on intelligent algorithms that perform critical operations using past data. Because the healthcare field is more sensitive than other related fields, it demands special attention. In the absence of complex algorithms for decades
Dataset description
A variety of screening techniques are recommended to detect the disease in its various phases. Clinicians examine electronic medical records carefully to diagnose and manage the patient. In certain situations, a manual mistake or a misinterpretation of the data may cause a diagnostic error. To prevent these problems, this paper provides an effective computer-aided diagnostic method built on intelligent learning models. In order to boost predictive efficiency a computer dependent
Results and discussion
This experimental work was carried out in a Java framework, with Python machine learning libraries accessed through bridges, on the Windows platform. Breast, cervical, and lung cancer datasets were used to conduct the study. The datasets are processed in every phase of the pipeline, starting with pre-processing, where the missing values are imputed. The cleaned data is then forwarded to the next phase, where the proposed GA-CFS algorithm finds the best features. This method identified five
Conclusion
Computational methods have gained prominence in the medical field and can provide profound solutions for complex systems. These systems help medical practitioners make better decisions based on the models' guidance, which represents knowledge captured by intelligent algorithms. This study presents an effective algorithmic model for better classification of clinical data labeled manually by experts. The proposed algorithm finds the
CRediT authorship contribution statement
Ching-Hsien Hsu: Conceptualization, Methodology, Software, Writing - original draft. Xing Chen: Writing - review & editing, Validation, Visualization, Investigation. Weiwei Lin: Investigation, Methodology, Validation, Supervision. Chuntao Jiang: Investigation, Validation, Supervision. Youhong Zhang: Investigation, Methodology, Software, Validation, Supervision. Zhifeng Hao: Writing - review & editing, Validation, Visualization, Investigation. Yeh-Ching Chung: Investigation, Methodology,
Declaration of Competing Interest
The authors declared that there is no conflict of interest.
Acknowledgement
This work was partially supported by the National Natural Science Foundation of China (Grant No. 61872084; 61802062; 62072187) and Guangdong-Hong Kong-Macao Intelligent Micro-Nano Optoelectronic Technology Joint Laboratory (Project No. 2020B1212030010).
References (59)
Decision support system concepts in expert systems: an empirical study. Decis. Support Syst. (1988)
A practical approach to feature selection. (1992)
Breast cancer diagnosis using genetically optimized neural network model. Expert Syst. Appl. (2015)
An immune-inspired semi-supervised algorithm for breast cancer diagnosis. Comput. Methods Programs Biomed. (2016)
Principles component analysis, fuzzy weighting pre-processing and artificial immune recognition system based diagnostic system for diagnosis of lung cancer. Expert Syst. Appl. (2008)
A hybrid classifier combining Borderline-SMOTE with AIRS algorithm for estimating brain metastasis from lung cancer: A case study in Taiwan. Comput. Methods Programs Biomed. (2015)
A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis. Expert Syst. Appl. (2011)
Optimal discriminant plane for a small number of samples and design method of classifier on the plane. Pattern Recogn. (1991)
Feature selection in machine learning: A new perspective. Neurocomputing (2018)
Predicting drug responsiveness with deep learning from the effects on gene expression of Obsessive-Compulsive Disorder affected cases. Comput. Commun. (2020)
Cognitive systems based on adaptive algorithms.
Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recogn.
Rationalizing medical work: decision-support techniques and medical practices.
Cluster randomized trial of a multifaceted primary care decision-support intervention for inherited breast cancer risk. Fam. Pract.
Comparison of two kinds of interface, based on guided navigation or usability principles, for improving the adoption of computerized decision support systems: application to the prescription of antibiotics. J. Am. Med. Inform. Assoc.
Integrating expert systems and decision support systems. MIS Quarterly
Clinical decision-support systems.
Computer programs to support clinical decision making. JAMA
Radiomic phenotyping in brain cancer to unravel hidden information in medical images. Top. Magn. Reson. Imaging
Reducing clinical noise for body mass index measures due to unit and transcription errors in the electronic health record. AMIA Summits on Translational Science Proceedings
Learning statistical models of phenotypes using noisy labeled training data. J. Am. Med. Inform. Assoc.
Input feature selection for classification problems. IEEE Trans. Neural Networks
Secondary use of EHR: data quality issues and informatics opportunities. Summit on Translational Bioinformatics
Maximizing detection of data inconsistency: The development of a consistency check interpreter. American Medical Informatics Association
Consensus methods for solving inconsistency of replicated data in distributed systems. Distributed and Parallel Databases
IOT based wearable sensor for diseases prediction and symptom analysis in healthcare sector. Peer-to-Peer Netw. Appl.
Knowledge based analysis of various statistical tools in detecting breast cancer. Computer Science & Information Technology
Naive Bayes classifiers: A probabilistic detection model for breast cancer. International Journal of Computer Applications