Modulo 9 model-based learning for missing data imputation
Applied Soft Computing (IF 8.7), Pub Date: 2021-02-10, DOI: 10.1016/j.asoc.2021.107167
Alladoumbaye Ngueilbaye, Hongzhi Wang, Daouda Ahmat Mahamat, Sahalu B. Junaidu

Managing missing values is one of the challenges data analysts face, so building effective data models for missing data imputation is a sound choice. Learning, training, and data analysis, however, must be carried out through machine learning algorithms. Missing data occurs when no response or value is recorded for a variable. It can seriously compromise data analysis and may eventually lead to erroneous conclusions. This paper first studies how missing data affects machine learning algorithms and the decisions made from the output of data analysis. Second, it proposes Modulo 9 as a novel method for handling missing data. The proposed method is assessed in wide-ranging experiments against established machine learning techniques and baselines: Support Vector Machine (SVM), Linear Regression (LR), K-Nearest Neighbors (KNN), Naïve Bayes (NB), Support Vector Classifier (SVC), Linear Support Vector Classifier (LSVC), Random Forest Classifier (RFC), Decision Tree Regressor (DTR), the Deletion Method, Multi-Layer Perceptron (MLP), and the Mean Value. The results show that the proposed method outperforms these eleven (11) existing methods.
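
The abstract names the Deletion Method and the Mean Value among the baselines but does not spell them out. As a rough illustration only, the two might be sketched in Python as below. This is not the Modulo 9 method, which the abstract does not describe. The function names and the toy table are hypothetical, and `None` is assumed to mark a missing entry.

```python
from statistics import mean


def delete_incomplete_rows(rows):
    """Deletion baseline: keep only rows with no missing (None) entries."""
    return [row for row in rows if all(v is not None for v in row)]


def impute_column_means(rows):
    """Mean-value baseline: replace each None with its column's observed mean."""
    col_means = []
    for col in zip(*rows):
        observed = [v for v in col if v is not None]
        # A column with no observed values cannot be imputed; it stays None.
        col_means.append(mean(observed) if observed else None)
    return [
        [col_means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy table (assumption, not data from the paper).
data = [
    [1.0, 10.0],
    [2.0, None],
    [None, 30.0],
    [3.0, 20.0],
]

complete = delete_incomplete_rows(data)
imputed = impute_column_means(data)
```

On this toy table, deletion keeps two of the four rows, while mean imputation keeps all four and fills each gap with its column average.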




Updated: 2021-02-15