COLI: Collaborative clustering missing data imputation
Pattern Recognition Letters (IF 3.9), Pub Date: 2021-11-09, DOI: 10.1016/j.patrec.2021.11.011
Daoming Wan, Roozbeh Razavi-Far, Mehrdad Saif, Niloofar Mozafari

Missing data imputation plays an important role in data cleansing. Clustering algorithms have been widely used for missing data imputation, yet little research has addressed the use of clustering ensembles, which aggregate multiple clustering results, for this task. This paper proposes a novel collaborative clustering-based imputation method, called COLI, which uses imputation quality as a key criterion for exchanging information between different clustering results. To the best of our knowledge, this is the first study of the impact of collaborative clustering on imputation performance. The contributions of this paper are three-fold. First, a novel missing value imputation method based on collaborative clustering is proposed. Second, three amputation strategies are used to induce missingness in various complete, publicly available numerical datasets under different missingness mechanisms, distributions, and ratios, which allows the imputation quality of the proposed method to be evaluated across these conditions. Third, the proposed method is compared with several state-of-the-art imputation methods, and the results demonstrate that it is effective for handling missing data.
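
The evaluation loop described in the abstract (ampute a complete dataset, impute it, score the imputed cells against the known truth) can be illustrated with a minimal sketch. This is not the COLI algorithm: the abstract does not give its details, so the sketch swaps in a plain single k-means imputer and only the simplest missingness pattern, with cells removed completely at random. All function names (`ampute_mcar`, `cluster_impute`, `rmse_on_missing`) and parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def ampute_mcar(data, ratio, rng):
    """Copy `data` and set a fraction `ratio` of its cells to None, uniformly at random."""
    out = [row[:] for row in data]
    cells = [(i, j) for i in range(len(data)) for j in range(len(data[0]))]
    for i, j in rng.sample(cells, int(ratio * len(cells))):
        out[i][j] = None
    return out


def column_means(rows):
    """Per-column mean over observed (non-None) values; 0.0 if a column is empty."""
    means = []
    for j in range(len(rows[0])):
        seen = [r[j] for r in rows if r[j] is not None]
        means.append(sum(seen) / len(seen) if seen else 0.0)
    return means


def partial_distance(row, center):
    """Mean squared difference computed over the observed coordinates only."""
    diffs = [(x - c) ** 2 for x, c in zip(row, center) if x is not None]
    return sum(diffs) / len(diffs) if diffs else 0.0


def cluster_impute(data, k, rng, iters=20):
    """Illustrative k-means imputation: fill each missing cell from its cluster center."""
    means = column_means(data)
    # Start from mean-filled rows so that centers are complete vectors.
    filled = [[means[j] if x is None else x for j, x in enumerate(r)] for r in data]
    centers = [row[:] for row in rng.sample(filled, k)]
    for _ in range(iters):
        # Assign each row to the nearest center using observed values only.
        labels = [min(range(k), key=lambda c: partial_distance(r, centers[c]))
                  for r in data]
        # Recompute each non-empty cluster's center from its currently filled rows.
        for c in range(k):
            members = [filled[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(col) for col in zip(*members)]
        # Re-impute missing cells; observed cells are never changed.
        filled = [[centers[labels[i]][j] if x is None else x
                   for j, x in enumerate(r)] for i, r in enumerate(data)]
    return filled


def rmse_on_missing(truth, amputed, imputed):
    """Root-mean-square error restricted to the cells that were amputed."""
    errs = [(truth[i][j] - imputed[i][j]) ** 2
            for i, r in enumerate(amputed) for j, x in enumerate(r) if x is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic complete data: two well-separated 2-D groups (an assumption for the demo).
    truth = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
    truth += [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]
    amputed = ampute_mcar(truth, 0.25, rng)
    imputed = cluster_impute(amputed, k=2, rng=rng)
    print("RMSE on amputed cells:", rmse_on_missing(truth, amputed, imputed))
```

Because the ground truth is known before amputation, the error can be measured on exactly the removed cells. A collaborative scheme of the kind the abstract describes would run several clusterings and exchange information between them based on this kind of quality score; that step is not sketched here.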

Updated: 2021-11-09