Motif-Based Spectral Clustering of Weighted Directed Networks
arXiv - CS - Social and Information Networks. Pub Date: 2020-04-02, DOI: arxiv-2004.01293
William George Underwood, Andrew Elliott, Mihai Cucuringu

Clustering is an essential technique for network analysis, with applications in a diverse range of fields. Although spectral clustering is a popular and effective method, it fails to consider higher-order structure and can perform poorly on directed networks. One approach is to capture and cluster higher-order structures using motif adjacency matrices. However, current formulations fail to take edge weights into account, and thus are somewhat limited when weight is a key component of the network under study. We address these shortcomings by exploring motif-based weighted spectral clustering methods. We present new and computationally useful matrix formulae for motif adjacency matrices on weighted networks, which can be used to construct efficient algorithms for any anchored or non-anchored motif on three nodes. In a very sparse regime, our proposed method can handle graphs with a million nodes and tens of millions of edges. We further use our framework to construct a motif-based approach for clustering bipartite networks. We provide comprehensive experimental results, demonstrating (i) the scalability of our approach, (ii) advantages of higher-order clustering on synthetic examples, and (iii) the effectiveness of our techniques on a variety of real-world data sets; and compare against several techniques from the literature. We conclude that motif-based spectral clustering is a valuable tool for analysis of directed and bipartite weighted networks, which is also scalable and easy to implement.
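
To make the idea of a motif adjacency matrix concrete, the following is a minimal sketch of the simplest case only: the triangle motif on an unweighted, undirected graph, where the matrix identity M = (A·A) ∘ A counts the triangles shared by each pair of nodes. It is not the weighted, directed formulation the abstract describes; the function names, the toy graph, and the two-cluster sign split on the second eigenvector are all illustrative assumptions.

```python
import numpy as np


def triangle_motif_adjacency(A):
    """M[i, j] = number of triangles containing both i and j.

    Assumes A is a symmetric 0/1 adjacency matrix with a zero diagonal
    (unweighted, undirected). (A @ A)[i, j] counts common neighbours of
    i and j; the elementwise product with A keeps only pairs that are
    themselves joined by an edge.
    """
    A = np.asarray(A, dtype=float)
    return (A @ A) * A


def fiedler_bipartition(M):
    """Split nodes in two using the second eigenvector of the
    normalised Laplacian of M.

    Assumes the graph defined by M is connected; nodes with zero motif
    degree are not handled specially in this sketch.
    """
    d = M.sum(axis=1)
    safe_d = np.where(d > 0, d, 1.0)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(safe_d), 0.0)
    # Symmetric normalised Laplacian: I - D^{-1/2} M D^{-1/2}
    L_sym = np.eye(len(M)) - d_inv_sqrt[:, None] * M * d_inv_sqrt[None, :]
    _, eigenvectors = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Rescale to the random-walk eigenvector before thresholding.
    fiedler = d_inv_sqrt * eigenvectors[:, 1]
    return (fiedler > 0).astype(int)


def toy_graph():
    """Hypothetical example: two 4-cliques joined through node 3."""
    edges = [
        (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),  # clique on 0..3
        (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),  # clique on 4..7
        (3, 4), (3, 5),                                  # weak link
    ]
    A = np.zeros((8, 8), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


A = toy_graph()
M = triangle_motif_adjacency(A)
labels = fiedler_bipartition(M)
print(M.astype(int))
print(labels)
```

In this toy graph the two cross-clique edges each lie in a single triangle, while edges inside a clique lie in two or three, so the motif adjacency matrix down-weights the link between the cliques. Which cluster receives label 0 or 1 depends on the arbitrary sign of the eigenvector. For graphs at the scale mentioned in the abstract, a sparse matrix representation would be needed instead of dense NumPy arrays.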

Updated: 2020-09-14