Sampling Graphlets of Multi-layer Networks: A Restricted Random Walk Approach
arXiv - CS - Social and Information Networks Pub Date : 2020-01-20 , DOI: arxiv-2001.07136
Simiao Jiao, Zihui Xue, Xiaowei Chen, Yuedong Xu

Graphlets are induced subgraph patterns that are crucial to understanding the structure and function of a large network. Considerable effort has been devoted to calculating graphlet statistics, where random walk based approaches are commonly used to access restricted graphs through the available application programming interfaces (APIs). However, most of these approaches consider only individual networks and overlook the strong coupling between different networks. In this paper, we estimate the graphlet concentration in multi-layer networks with real-world applications. An inter-layer edge connects two nodes in different layers if they belong to the same person. Access to a multi-layer network is restricted in the sense that the upper layer allows random walk sampling, whereas the nodes of lower layers can be accessed only through the inter-layer edges and support only random node or edge sampling. To cope with this new challenge, we define a suite of two-layer graphlets and propose a novel random walk sampling algorithm to estimate the proportion of all the 3-node graphlets. An analytical bound on the number of sampling steps is proved to guarantee the convergence of our unbiased estimator. We further generalize our algorithm to explore the tradeoff between the estimation accuracies of different graphlets when the sample size is split across different layers. Experimental evaluations on real-world and synthetic multi-layer networks demonstrate the accuracy and high efficiency of our unbiased estimators.
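
To make the idea of estimating 3-node graphlet concentration by random walk concrete, the following is a minimal single-layer sketch. It is not the two-layer algorithm proposed in the paper: it assumes an ordinary connected, undirected graph held in memory as a dictionary of neighbor sets, and all names (`EXAMPLE_ADJ`, `estimate_triangle_concentration`, `exact_triangle_concentration`) are hypothetical. A simple random walk visits each ordered 2-path (u, v, w) with stationary probability proportional to 1/deg(v), so each sample is reweighted by deg(v); an open wedge corresponds to 2 ordered paths and a triangle to 6, which gives the divisors used below.

```python
import random
from itertools import combinations

# Hypothetical toy graph (undirected, connected), stored as neighbor sets.
EXAMPLE_ADJ = {
    0: {1, 2},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {1, 2, 4},
    4: {3, 5},
    5: {4},
}

def exact_triangle_concentration(adj):
    """Brute force: share of triangles among connected 3-node induced subgraphs."""
    triangles = wedges = 0
    for a, b, c in combinations(sorted(adj), 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 3:
            triangles += 1
        elif edges == 2:
            wedges += 1
    return triangles / (triangles + wedges)

def estimate_triangle_concentration(adj, steps, seed=0):
    """Estimate the triangle concentration from a simple random walk.

    Every three consecutive walk nodes (prev, cur, nxt) with prev != nxt
    induce either an open wedge or a triangle. Weighting each sample by
    deg(cur) cancels the 1/deg(cur) sampling bias of the walk.
    """
    rng = random.Random(seed)
    neighbors = {v: sorted(ns) for v, ns in adj.items()}
    prev = rng.choice(sorted(adj))
    cur = rng.choice(neighbors[prev])
    tri_weight = wedge_weight = 0.0
    for _ in range(steps):
        nxt = rng.choice(neighbors[cur])
        if nxt != prev:  # skip backtracking steps, which cover only 2 nodes
            weight = len(neighbors[cur])
            if nxt in adj[prev]:
                tri_weight += weight / 6   # a triangle has 6 ordered 2-paths
            else:
                wedge_weight += weight / 2  # an open wedge has 2 ordered 2-paths
        prev, cur = cur, nxt
    # Assumes the walk produced at least one non-backtracking sample.
    return tri_weight / (tri_weight + wedge_weight)
```

On the toy graph the brute-force value is 2/7 (two triangles and five open wedges), and the walk-based estimate should approach it as `steps` grows. The ratio form is only asymptotically unbiased, and extending the reweighting to the restricted two-layer access model described above would need the paper's own analysis.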

Updated: 2020-05-12