M-Evolve: Structural-Mapping-Based Data Augmentation for Graph Classification, arXiv - CS - Social and Information Networks



arXiv - CS - Social and Information Networks. Pub Date: 2020-07-11, DOI: arxiv-2007.05700
Jiajun Zhou, Jie Shen, Shanqing Yu, Guanrong Chen, Qi Xuan

Graph classification, which aims to identify the category labels of graphs, plays a significant role in drug classification, toxicity detection, protein analysis, etc. However, the limited scale of the benchmark datasets makes graph classification models prone to over-fitting and undergeneralization. To address this, we introduce data augmentation on graphs (i.e. graph augmentation) and present four methods: random mapping, vertex-similarity mapping, motif-random mapping and motif-similarity mapping. These methods generate more weakly labeled data for small-scale benchmark datasets via heuristic transformation of graph structures. Furthermore, we propose a generic model evolution framework, named M-Evolve, which combines graph augmentation, data filtration and model retraining to optimize pre-trained graph classifiers. Experiments on six benchmark datasets demonstrate that the proposed framework helps existing graph classification models alleviate over-fitting and undergeneralization when trained on small-scale benchmark datasets, yielding an average accuracy improvement of 3-13% on graph classification tasks.
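To make the idea of a heuristic structural transformation concrete, the following is a minimal sketch of what a random-mapping style augmentation could look like. It rests on an assumption the abstract does not spell out: that a fixed fraction of existing edges is removed and the same number of currently absent edges is added, both chosen uniformly at random. The function name `random_mapping` and the `budget` parameter are hypothetical and used only for illustration. The augmented graph would simply inherit the label of its source graph, which is why such data is only weakly labeled.

```python
import itertools
import math
import random


def random_mapping(edges, num_nodes, budget=0.15, seed=None):
    """Return a rewired copy of an undirected graph's edge set.

    Assumption (not stated in the abstract): remove ceil(budget * m)
    existing edges and add the same number of currently absent edges,
    both sampled uniformly at random.
    """
    rng = random.Random(seed)
    # Normalise undirected edges to sorted tuples so (u, v) == (v, u).
    existing = {tuple(sorted(e)) for e in edges}
    # Candidate additions: every node pair that is not already an edge.
    absent = [p for p in itertools.combinations(range(num_nodes), 2)
              if p not in existing]
    # Never try to add more edges than there are candidates.
    k = min(math.ceil(budget * len(existing)), len(absent))
    removed = set(rng.sample(sorted(existing), k))
    added = set(rng.sample(absent, k))
    return (existing - removed) | added


# Usage: a 6-node cycle, rewiring two of its six edges.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
augmented = random_mapping(cycle, num_nodes=6, budget=0.3, seed=0)
```

Because removals and additions are equal in number, the edge count stays unchanged. The similarity-based and motif-based variants named in the abstract would presumably replace the uniform sampling with a weighted choice, but their details are not given here.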


Updated: 2020-08-26
