Handling Missing Data with Graph Representation Learning
arXiv - CS - Computational Engineering, Finance, and Science Pub Date : 2020-10-30 , DOI: arxiv-2010.16418
Jiaxuan You, Xiaobai Ma, Daisy Yi Ding, Mykel Kochenderfer, Jure Leskovec

Machine learning with missing data has been approached in two ways: feature imputation, where missing feature values are estimated from observed values, and label prediction, where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose GRAPE, a graph-based framework for feature imputation as well as label prediction. GRAPE tackles the missing data problem using a graph representation, where the observations and features are viewed as two types of nodes in a bipartite graph, and the observed feature values as edges. Under the GRAPE framework, feature imputation is formulated as an edge-level prediction task and label prediction as a node-level prediction task. These tasks are then solved with Graph Neural Networks. Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks, compared with existing state-of-the-art methods.
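
The bipartite construction described in the abstract can be illustrated with a minimal sketch. It assumes missing entries are encoded as NaN in a NumPy array. The function name `to_bipartite_edges`, the node-numbering scheme, and the toy matrix are illustrative assumptions, not taken from the paper, and the Graph Neural Network that would consume these edges is omitted.

```python
import numpy as np


def to_bipartite_edges(X):
    """Convert a data matrix with NaNs into a bipartite edge list.

    Hypothetical helper (not the authors' code). Observations (rows) become
    nodes 0..n-1 and features (columns) become nodes n..n+m-1. Each observed
    entry X[i, j] becomes an edge (i, n + j) carrying the value as its
    attribute. Each missing entry becomes a query pair whose edge value an
    imputation model would have to predict.
    """
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    observed = ~np.isnan(X)

    # Edges for observed feature values.
    rows, cols = np.nonzero(observed)
    edge_index = np.stack([rows, cols + n])  # shape (2, num_observed)
    edge_attr = X[rows, cols]                # observed values

    # Node pairs with no edge: the targets of edge-level prediction.
    miss_rows, miss_cols = np.nonzero(~observed)
    query_index = np.stack([miss_rows, miss_cols + n])

    return edge_index, edge_attr, query_index


# Toy example: 2 observations, 3 features, 2 missing entries.
X = np.array([[0.3, np.nan, 0.1],
              [np.nan, 0.5, 0.7]])
edge_index, edge_attr, query_index = to_bipartite_edges(X)
```

In this framing, imputation amounts to predicting a value for each column of `query_index`, while label prediction would read out a representation for each observation node (indices `0..n-1`).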

Updated: 2020-11-02