Generative Imputation and Stochastic Prediction
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 9-7-2020, DOI: 10.1109/tpami.2020.3022383
Mohammad Kachuee, Kimmo Karkkainen, Orpaz Goldstein, Sajad Darabi, Majid Sarrafzadeh

In many machine learning applications, we are faced with incomplete datasets. In the literature, missing data imputation techniques have mostly been concerned with filling in missing values. However, missing values imply uncertainty not only over the distribution of the missing values but also over target class assignments, and both require careful consideration. In this paper, we propose a simple and effective method for imputing missing features and estimating the distribution of target assignments given incomplete data. To make imputations, we train a generator network to generate imputations that a discriminator network is tasked with distinguishing. Following this, a predictor network is trained on the imputed samples from the generator network to capture the classification uncertainties and make predictions accordingly. The proposed method is evaluated on the CIFAR-10 and MNIST image datasets as well as five real-world tabular classification datasets, under different missingness rates and structures. Our experimental results show the effectiveness of the proposed method in generating imputations as well as in providing estimates of class uncertainty in a classification task when faced with missing values.
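
The idea of turning a distribution over imputations into a distribution over class assignments can be illustrated with a minimal sketch. This is not the authors' implementation: the trained generator is replaced by a fixed Gaussian sampler and the predictor network by a logistic model with fixed weights, and every name below (`sample_imputation`, `predict_proba`, `stochastic_predict`, and all parameters) is hypothetical. It only shows how averaging a predictor over many sampled imputations yields both a class probability and a spread that reflects the uncertainty caused by missing features.

```python
import math
import random


def sample_imputation(x, mask, rng, fill_mean=0.0, fill_std=1.0):
    """Fill missing entries (mask[i] == 0) with random draws; keep observed ones.

    Stand-in for a trained generator: draws come from a fixed Gaussian
    (an assumption for illustration), not from a learned distribution.
    """
    return [xi if m == 1 else rng.gauss(fill_mean, fill_std)
            for xi, m in zip(x, mask)]


def predict_proba(x, weights, bias):
    """Toy binary predictor: logistic model with fixed, hypothetical weights."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_predict(x, mask, weights, bias, n_samples=200, seed=0):
    """Average the predictor over many imputations and report the spread."""
    rng = random.Random(seed)
    probs = [predict_proba(sample_imputation(x, mask, rng), weights, bias)
             for _ in range(n_samples)]
    mean = sum(probs) / n_samples
    var = sum((p - mean) ** 2 for p in probs) / n_samples
    return mean, math.sqrt(var)


if __name__ == "__main__":
    weights, bias = [1.0, -2.0, 0.5], 0.1
    x = [0.3, 0.0, -1.2]
    print("fully observed:", stochastic_predict(x, [1, 1, 1], weights, bias))
    print("feature 2 missing:", stochastic_predict(x, [1, 0, 1], weights, bias))
```

With a fully observed input every imputation is identical, so the spread collapses to zero; masking a feature makes the predicted probability vary across samples, which is the kind of class uncertainty the abstract refers to.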

Last updated: 2024-08-22