Covariate-Assisted Community Detection on Sparse Networks
arXiv - STAT - Methodology. Pub Date: 2022-07-30, DOI: arxiv-2208.00257
Yaofang Hu, Wanjie Wang

Community detection is an important problem in the analysis of network data. In many real data sets, the adjacency matrix is too sparse at some nodes for existing methods to recover any community information. Covariates have been shown to help community detection, but combining them with the network is challenging, because covariates can be high-dimensional and their class labels can be inconsistent with those of the network. To quantify the relationship between the covariates and the network, we propose a general model, the covariate-assisted degree-corrected stochastic block model (CA-DCSBM). Based on CA-DCSBM, we design the adjusted neighbor-covariate (ANC) data matrix, which leverages covariate information to assist community detection. We then prove that spectral clustering on the ANC matrix combines the information in the network and the covariates. The resulting method, named CA-SCORE, is shown to have the oracle property under mild conditions. In particular, we show that our framework covers challenging scenarios in which the adjacency matrix carries no community information, or the covariate matrix has community labels that differ from those of the adjacency matrix. Finally, we apply CA-SCORE to several synthetic and real data sets and show that it outperforms other community detection methods.
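
To make the idea of spectral clustering on a combined network-covariate matrix concrete, here is a minimal Python sketch. It is not the paper's ANC construction, which the abstract does not define: the combination `A @ X + alpha * X`, the weight `alpha`, the helper `kmeans`, and the function name `covariate_assisted_spectral` are all assumptions made for illustration. The ratio step follows the general SCORE idea of dividing by the leading singular vector, which the method name CA-SCORE suggests but the abstract does not spell out.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with farthest-point initialisation."""
    rng = np.random.default_rng(seed)
    centers = [points[rng.integers(len(points))]]
    for _ in range(1, k):
        # squared distance from every point to its nearest chosen center
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def covariate_assisted_spectral(A, X, k, alpha=1.0):
    """Cluster nodes from adjacency A (n x n) and covariates X (n x p).

    ASSUMPTION: Y = A @ X + alpha * X is a hypothetical stand-in for the
    paper's ANC matrix. Each row sums the covariates of a node's neighbours
    and adds alpha times the node's own covariates.
    """
    Y = A @ X + alpha * X
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    U = U[:, :k]
    # SCORE-style step: divide by the leading singular vector so that
    # node-specific scale (degree) effects cancel in the ratios.
    lead = U[:, 0]
    lead = np.where(np.abs(lead) < 1e-12, 1e-12, lead)
    R = U[:, 1:] / lead[:, None]
    return kmeans(R, k)

# Toy data: two triangles joined by the single edge (2, 3);
# covariates are noise-free one-hot indicators of the true community.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
X = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

labels = covariate_assisted_spectral(A, X, k=2)
print(labels)  # nodes 0-2 share one label, nodes 3-5 the other
```

The toy input is deliberately noise-free, so it only illustrates the mechanics. It says nothing about the sparse or label-mismatched regimes the paper analyses, where the choice of weighting is exactly what matters.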

Updated: 2022-08-02