M-Evolve: Structural-Mapping-Based Data Augmentation for Graph Classification, IEEE Transactions on Network Science and Engineering



IEEE Transactions on Network Science and Engineering (IF 6.7), Pub Date: 2020-10-22, DOI: 10.1109/tnse.2020.3032950
Jiajun Zhou, Jie Shen, Shanqing Yu, Guanrong Chen, Qi Xuan

Graph classification, which aims to identify the category labels of graphs, plays a significant role in drug classification, toxicity detection, protein analysis, etc. However, the small scale of the benchmark datasets makes graph classification models prone to over-fitting and under-generalization. To address this, we introduce data augmentation on graphs (i.e., graph augmentation) and present four methods (random mapping, vertex-similarity mapping, motif-random mapping and motif-similarity mapping) that generate more weakly labeled data for small-scale benchmark datasets via heuristic transformation of graph structures. Furthermore, we propose a generic model evolution framework, named M-Evolve, which combines graph augmentation, data filtration and model retraining to optimize pre-trained graph classifiers. Experiments on six benchmark datasets demonstrate that the proposed framework helps existing graph classification models alleviate over-fitting and under-generalization when trained on small-scale benchmark datasets, yielding an average accuracy improvement of 3-13% on graph classification tasks.
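
The abstract names the augmentation methods but does not spell out how they operate. As a rough illustration only, the following Python sketch shows one plausible reading of random mapping: remove a fraction of a graph's edges and add the same number of currently absent edges, chosen uniformly at random, with the new graph inheriting the original label as a weak label. The function name `random_mapping`, the `budget` parameter, and the toy graph are assumptions made for this example, not details taken from the paper.

```python
import random
from itertools import combinations


def random_mapping(nodes, edges, budget=0.2, seed=None):
    """Return a perturbed copy of an undirected graph's edge set.

    Assumption (not stated in the abstract): "random mapping" is read here
    as removing a fraction of existing edges and adding the same number of
    currently absent edges, chosen uniformly at random.
    """
    rng = random.Random(seed)
    # Normalise edges so (u, v) and (v, u) count as the same undirected edge.
    existing = {tuple(sorted(e)) for e in edges}
    absent = [p for p in combinations(sorted(nodes), 2) if p not in existing]

    # Number of edges to swap, capped by what is available on either side.
    k = min(int(len(existing) * budget), len(absent))
    removed = set(rng.sample(sorted(existing), k))
    added = set(rng.sample(absent, k))
    return (existing - removed) | added


# Hypothetical toy graph: a 5-node path.
nodes = [0, 1, 2, 3, 4]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
augmented = random_mapping(nodes, edges, budget=0.5, seed=0)
```

With `budget=0.5`, two of the four path edges are replaced, so the augmented graph keeps the same node set and edge count. Under the framework described above, such a graph would still need to pass the data filtration step before being used for retraining.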


Updated: 2020-10-22
