Enhancing data analysis: uncertainty-resistance method for handling incomplete data, Applied Intelligence

Applied Intelligence (IF 3.4), Pub Date: 2019-06-25, DOI: 10.1007/s10489-019-01514-4
Javad Hamidzadeh, Mona Moradi

Abstract

In data analysis, incomplete data are common and can significantly affect the conclusions that can be drawn from the data. Incomplete data also introduce uncertainty, which leads to unreliable results. Hence, developing effective techniques to impute these missing values is crucial. Missing or incomplete data and noise are two common sources of uncertainty. In this paper, an effective method for imputing missing values is introduced that is robust to the uncertainties arising from incompleteness and noise. First, a kernel-based method for removing noise is designed. Next, the class of the incomplete data is determined using belief function theory. Finally, every missing dimension is imputed with the mean value of the same dimension over the members of the determined class. The performance has been evaluated on real-world data sets from the UCI repository. Compared with state-of-the-art methods, the experimental results show the superiority of the proposed method in terms of classification accuracy.
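The final step described in the abstract, filling each missing dimension with the mean of that dimension over the members of the determined class, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `impute_by_class_mean` and the toy data are hypothetical, and the class labels are assumed to be already known, whereas in the paper they come from the belief-function step after kernel-based noise removal, neither of which is reproduced here.

```python
import math


def impute_by_class_mean(rows, labels):
    """Replace NaN entries with the mean of the same column over
    the observed values of rows that share the same class label.

    Hypothetical helper: labels are assumed to be given; the paper
    derives them with belief function theory, which is not shown.
    """
    n_cols = len(rows[0])
    sums = {}    # label -> per-column sum of observed values
    counts = {}  # label -> per-column count of observed values
    for row, label in zip(rows, labels):
        s = sums.setdefault(label, [0.0] * n_cols)
        c = counts.setdefault(label, [0] * n_cols)
        for j, value in enumerate(row):
            if not math.isnan(value):
                s[j] += value
                c[j] += 1

    imputed = []
    for row, label in zip(rows, labels):
        new_row = []
        for j, value in enumerate(row):
            if math.isnan(value):
                if counts[label][j] == 0:
                    # No observed value in this class for this column.
                    raise ValueError(
                        f"no observed values for class {label!r}, column {j}"
                    )
                value = sums[label][j] / counts[label][j]
            new_row.append(value)
        imputed.append(new_row)
    return imputed


# Toy data (made up for illustration): NaN marks a missing dimension.
rows = [
    [1.0, 2.0],
    [3.0, math.nan],
    [math.nan, 6.0],
    [10.0, 20.0],
    [math.nan, 40.0],
]
labels = ["a", "a", "a", "b", "b"]
filled = impute_by_class_mean(rows, labels)
```

The sketch computes the class means from observed values only, so a missing entry never contributes to the mean used to fill another missing entry.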

Updated: 2020-01-04
