Model selection for degree-corrected block models
Journal of Statistical Mechanics: Theory and Experiment (IF 2.4), Pub Date: 2014-05-16, DOI: 10.1088/1742-5468/2014/05/p05007
Xiaoran Yan, Cosma Shalizi, Jacob E Jensen, Florent Krzakala, Cristopher Moore, Lenka Zdeborová, Pan Zhang, Yaojia Zhu

The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for undertaking this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction and point to a general approach to model selection in network analysis.
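The difference between the two model families described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes the block labels are already known (the paper instead sums over latent assignments using belief propagation), and it assumes a Poisson formulation in which each node pair receives an independent Poisson number of edges, a common convention for degree-corrected block models. The names `poisson_loglik`, `sbm_rate`, `dc_sbm_rate`, `omega`, `theta`, and `blocks` are hypothetical and chosen only for this illustration.

```python
import math
from itertools import combinations


def poisson_loglik(adj, rate):
    """Log-likelihood of an undirected graph with no self-loops, assuming
    each pair i < j carries an independent Poisson(rate(i, j)) edge count."""
    n = len(adj)
    total = 0.0
    for i, j in combinations(range(n), 2):
        lam = rate(i, j)  # must be strictly positive
        a = adj[i][j]
        total += a * math.log(lam) - lam - math.lgamma(a + 1)
    return total


def sbm_rate(blocks, omega):
    """Ordinary block model: the rate depends only on the two blocks."""
    return lambda i, j: omega[blocks[i]][blocks[j]]


def dc_sbm_rate(blocks, omega, theta):
    """Degree-corrected model: one extra parameter theta[i] per node
    rescales that node's overall degree (assumed parameterization)."""
    return lambda i, j: theta[i] * theta[j] * omega[blocks[i]][blocks[j]]


# Toy 4-node graph in which node 0 is a hub; labels and rates are made up.
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 1, 0],
]
blocks = [0, 0, 1, 1]
omega = [[0.5, 0.4], [0.4, 0.5]]

ll_sbm = poisson_loglik(adj, sbm_rate(blocks, omega))
# With every theta equal to 1 the degree-corrected model reduces to the SBM.
ll_dc_unit = poisson_loglik(adj, dc_sbm_rate(blocks, omega, [1.0] * 4))
# A non-uniform theta lets the hub have a higher expected degree.
ll_dc_hub = poisson_loglik(adj, dc_sbm_rate(blocks, omega, [1.8, 0.6, 0.8, 0.8]))
```

Because the ordinary block model is the special case of the degree-corrected model with all `theta[i]` equal, the two are nested, which is why the abstract frames the choice between them in terms of log-likelihood ratios. The abstract also reports that the classical large-sample results for such ratios break down on sparse graphs, so a naive comparison of the values above would not be a valid test.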

Updated: 2014-05-16