Efficient and High-Quality Seeded Graph Matching: Employing Higher-order Structural Information, ACM Transactions on Knowledge Discovery from Data



ACM Transactions on Knowledge Discovery from Data (IF 4.0), Pub Date: 2021-05-03, DOI: 10.1145/3442340
Haida Zhang, Zengfeng Huang, Xuemin Lin, Zhe Lin, Wenjie Zhang, Ying Zhang


Driven by many real applications, we study the problem of seeded graph matching. Given two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$, and a small set $S$ of pre-matched node pairs $[u, v]$ where $u \in V_1$ and $v \in V_2$, the problem is to identify a matching between $V_1$ and $V_2$ growing from $S$, such that each pair in the matching corresponds to the same underlying entity. Recent studies on efficient and effective seeded graph matching have drawn a great deal of attention, and many popular methods are largely based on exploring the similarity between local structures to identify matching pairs. While these recent techniques work provably well on random graphs, their accuracy is low on many real networks. In this work, we propose to utilize higher-order neighboring information to improve the matching accuracy and efficiency. As a result, a new framework of seeded graph matching is proposed, which employs Personalized PageRank (PPR) to quantify the matching score of each node pair. To further boost the matching accuracy, we propose a novel postponing strategy, which postpones the selection of pairs that have competitors with similar matching scores. We show that the postponing strategy indeed significantly improves the matching accuracy. To improve the scalability of matching large graphs, we also propose efficient approximation techniques based on algorithms for computing PPR heavy hitters. Our comprehensive experimental studies on large-scale real datasets demonstrate that, compared with state-of-the-art approaches, our framework not only increases both precision and recall by a significant margin but also achieves speed-ups of up to more than one order of magnitude.
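The abstract does not state the scoring formula or the exact postponing rule, so the following is only a minimal Python sketch of the general idea under stated assumptions. PPR is computed by plain power iteration (not the heavy-hitter approximation the paper refers to). The pair score used here, a weighted Jaccard similarity between the PPR values that two nodes receive from the seeds, and the `gap` threshold for postponing are hypothetical choices made for illustration, as are all function and parameter names.

```python
from collections import defaultdict


def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """PPR from `source` by power iteration on a graph {node: [neighbors]}."""
    p = {source: 1.0}
    for _ in range(iters):
        nxt = defaultdict(float)
        nxt[source] += alpha  # teleport back to the source
        for u, mass in p.items():
            nbrs = adj.get(u, [])
            if not nbrs:  # dangling node: return its mass to the source
                nxt[source] += (1 - alpha) * mass
                continue
            share = (1 - alpha) * mass / len(nbrs)
            for w in nbrs:
                nxt[w] += share
        p = dict(nxt)
    return p


def seed_signatures(adj, seeds):
    """Map each node to the vector of PPR values it receives from every seed."""
    sig = defaultdict(lambda: [0.0] * len(seeds))
    for i, s in enumerate(seeds):
        for node, val in personalized_pagerank(adj, s).items():
            sig[node][i] = val
    return sig


def similarity(x, y):
    """Weighted Jaccard similarity (an illustrative score, not the paper's)."""
    den = sum(max(a, b) for a, b in zip(x, y))
    return sum(min(a, b) for a, b in zip(x, y)) / den if den else 0.0


def match_round(g1, g2, seed_pairs, gap=0.1):
    """One matching round: accept clear winners, postpone close contests."""
    s1 = [u for u, _ in seed_pairs]
    s2 = [v for _, v in seed_pairs]
    sig1, sig2 = seed_signatures(g1, s1), seed_signatures(g2, s2)
    candidates = [v for v in g2 if v not in s2]
    matched, postponed = {}, []
    for u in g1:
        if u in s1 or not candidates:
            continue
        ranked = sorted(
            ((similarity(sig1[u], sig2[v]), v) for v in candidates),
            reverse=True,
        )
        best_score, best_v = ranked[0]
        runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
        if best_score - runner_up >= gap:
            matched[u] = best_v  # clear winner (one-to-one is not enforced)
        else:
            postponed.append(u)  # a competitor scores too similarly
    return matched, postponed


# Toy example: two copies of the same graph, with one seed pair ("s", "S").
g1 = {"s": ["x", "y", "z"], "x": ["s"], "y": ["s"], "z": ["s", "w"], "w": ["z"]}
g2 = {"S": ["X", "Y", "Z"], "X": ["S"], "Y": ["S"], "Z": ["S", "W"], "W": ["Z"]}
matched, postponed = match_round(g1, g2, [("s", "S")])
```

On this toy input, `z` and `w` are matched to `Z` and `W`, while the structurally indistinguishable leaves `x` and `y` are postponed because their two best candidates score almost identically. In a full pipeline the accepted pairs would presumably be added to the seed set and the round repeated, which is what lets postponed nodes be resolved later with more evidence.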


Updated: 2021-05-03
