Exact Recovery of Community Detection in k-Partite Graph Models with Applications to Learning Electric Potentials in Electric Networks

Journal of Statistical Physics

Abstract

We study the vertex classification problem on a graph whose vertices belong to \(k\ (k\ge 2)\) different communities, where edges are allowed only between distinct communities and the numbers of vertices in different communities are not necessarily equal. The observation is a weighted adjacency matrix perturbed by a scalar multiple of a Gaussian Orthogonal Ensemble (GOE) or Gaussian Unitary Ensemble (GUE) matrix. For exact recovery by maximum likelihood estimation (MLE) with various weighted adjacency matrices, we prove sharp thresholds for the intensity \(\sigma \) of the Gaussian perturbation. Roughly speaking, when \(\sigma \) is below (resp. above) the threshold, exact recovery by MLE occurs with probability tending to 1 (resp. 0) as the size of the graph goes to infinity. These weighted adjacency matrices may be considered natural models for electric networks. Surprisingly, these thresholds for \(\sigma \) do not depend on whether the sample space for MLE is restricted to classifications in which the number of vertices in each group equals its true value. In contrast to \({{\mathbb {Z}}}_2\)-synchronization, a new complex version of semidefinite programming (SDP) is designed to solve the community detection problem efficiently when the number of communities \(k\) is greater than 2, and we obtain a common region for \(\sigma \) (independent of \(k\)) in which SDP exactly recovers the true classification.
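To make the observation model concrete, the following is a minimal sketch rather than the paper's construction. It assumes unit edge weights between distinct communities and one common GOE normalization (variance 1 off the diagonal, variance 2 on it); the function names `kpartite_adjacency`, `goe_matrix`, and `observe` are hypothetical.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def kpartite_adjacency(labels: List[int]) -> Matrix:
    """Weight 1 between vertices in distinct communities, 0 otherwise (assumed weights)."""
    n = len(labels)
    return [[1.0 if labels[i] != labels[j] else 0.0 for j in range(n)] for i in range(n)]


def goe_matrix(n: int, rng: random.Random) -> Matrix:
    """Symmetric Gaussian matrix: N(0, 1) off the diagonal, N(0, 2) on it (assumed convention)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w[i][i] = rng.gauss(0.0, math.sqrt(2.0))
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = rng.gauss(0.0, 1.0)
    return w


def observe(labels: List[int], sigma: float, rng: random.Random) -> Matrix:
    """Observation = adjacency matrix + sigma * GOE perturbation."""
    n = len(labels)
    a = kpartite_adjacency(labels)
    w = goe_matrix(n, rng)
    return [[a[i][j] + sigma * w[i][j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Three communities of unequal sizes 2, 1, and 3.
    labels = [0, 0, 1, 2, 2, 2]
    for row in observe(labels, sigma=0.3, rng=random.Random(0)):
        print(" ".join(f"{value:6.2f}" for value in row))
```

With `sigma = 0` the observation reduces to the noiseless adjacency matrix, so `sigma` alone controls how strongly the perturbation masks the community structure.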

Acknowledgements

ZL’s research is supported by National Science Foundation Grant 1608896 and Simons Foundation Grant 638143.

Author information

Corresponding author

Correspondence to Zhongyang Li.

Additional information

Communicated by Federico Ricci-Tersenghi.

Appendix A: Proof of Lemma 3.9

It is known that for a standard Gaussian random variable \(G_i\) and \(x>0\),

$$\begin{aligned} \frac{xe^{-\frac{x^2}{2}}}{\sqrt{2\pi }(1+x^2)}\le \mathrm {Pr}(G_i>x)\le \frac{e^{-\frac{x^2}{2}}}{x\sqrt{2\pi }}. \end{aligned}\tag{7.31}$$
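The two-sided bound (7.31) can be checked numerically. The sketch below is only an illustration: it computes the exact tail with the standard library's `math.erfc` and compares it with both sides of (7.31). The helper names `gaussian_tail` and `tail_bounds` are hypothetical.

```python
import math
from typing import Tuple


def gaussian_tail(x: float) -> float:
    """Exact Pr(G > x) for a standard Gaussian G, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_bounds(x: float) -> Tuple[float, float]:
    """Lower and upper bounds of (7.31); both require x > 0."""
    density = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    lower = x * density / (1.0 + x * x)
    upper = density / x
    return lower, upper


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0):
        lower, upper = tail_bounds(x)
        print(f"x={x}: {lower:.3e} <= {gaussian_tail(x):.3e} <= {upper:.3e}")
```

The ratio of the lower bound to the upper bound is \(x^2/(1+x^2)\), so the two bounds become tight as \(x\) grows, which is the regime \(x\approx \sqrt{2\log N}\) used in the rest of the proof.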

Let \(G_1,\ldots , G_N\) be N standard Gaussian random variables, not necessarily independent. Then by the union bound and the upper bound in (7.31) we have

$$\begin{aligned} \mathrm {Pr}\left( \max _{i\in [N]}G_i\ge (1+\epsilon )\sqrt{2\log N}\right)\le & {} \sum _{i\in [N]}\mathrm {Pr}\left( G_i\ge (1+\epsilon )\sqrt{2\log N}\right) \\\le & {} \frac{N e^{-(1+\epsilon )^2\log N}}{2(1+\epsilon )\sqrt{\pi \log N}}\\\le & {} N^{-\epsilon } \end{aligned}$$
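As a sanity check on the chain of inequalities above, the following sketch evaluates its three quantities for a few illustrative values of \(N\) and \(\epsilon\): the exact sum of tail probabilities, the closed-form bound obtained from (7.31), and \(N^{-\epsilon }\). The function names and the chosen values are assumptions made for illustration only.

```python
import math


def exact_tail_sum(n: int, eps: float) -> float:
    """Sum over i of Pr(G_i >= (1 + eps) * sqrt(2 log N)), i.e. N times one exact tail."""
    x = (1.0 + eps) * math.sqrt(2.0 * math.log(n))
    return n * 0.5 * math.erfc(x / math.sqrt(2.0))


def closed_form_bound(n: int, eps: float) -> float:
    """Middle expression of the display: N e^{-(1+eps)^2 log N} / (2 (1+eps) sqrt(pi log N))."""
    log_n = math.log(n)
    numerator = n * math.exp(-((1.0 + eps) ** 2) * log_n)
    return numerator / (2.0 * (1.0 + eps) * math.sqrt(math.pi * log_n))


if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        for eps in (0.1, 0.5):
            print(n, eps, exact_tail_sum(n, eps), closed_form_bound(n, eps), n ** (-eps))
```

No independence is used here: the first quantity is a sum of marginal tail probabilities, which is all the union bound requires.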

If we further assume that the \(G_i\) are independent, then

$$\begin{aligned} \mathrm {Pr}\left( \max _{i\in [N]}G_i< (1-\epsilon )\sqrt{2\log N}\right)= & {} \prod _{i\in [N]}\mathrm {Pr}\left( G_i<(1-\epsilon )\sqrt{2\log N}\right) \\= & {} \prod _{i\in [N]}\left[ 1-\mathrm {Pr}\left( G_i>(1-\epsilon )\sqrt{2\log N}\right) \right] \end{aligned}$$

By (7.31) we obtain

$$\begin{aligned} \mathrm {Pr}\left( \max _{i\in [N]}G_i< (1-\epsilon )\sqrt{2\log N}\right)\le & {} \left( 1-\frac{(1-\epsilon )\sqrt{2\log N}}{\sqrt{2\pi }(1+2(1-\epsilon )^2\log N)}\frac{1}{N^{(1-\epsilon )^2}}\right) ^N \end{aligned}$$

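When (3.22) holds, the term subtracted inside the parentheses above is at least \(N^{-(1-\epsilon )}\). Since \(\left( 1-\frac{1}{m}\right) ^m\le e^{-1}\) for every \(m\ge 1\), we have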

$$\begin{aligned} \mathrm {Pr}\left( \max _{i\in [N]}G_i< (1-\epsilon )\sqrt{2\log N}\right) \le \left( 1-\frac{1}{N^{1-\epsilon }}\right) ^{N^{1-\epsilon }\cdot N^{\epsilon }}\le e^{-N^{\epsilon }} \end{aligned}$$

Then the lemma follows.
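The lower-tail estimate can also be examined numerically in the independent case, where the probability is an exact product. The sketch below works with logarithms to avoid underflow and compares three quantities: the exact log-probability, the log of the bound obtained from (7.31), and \(-N^{\epsilon }\). Condition (3.22) is not reproduced in this appendix, so the sketch does not test it; the values of \(N\) and \(\epsilon\) are illustrative assumptions for which the final inequality happens to hold, and the function names are hypothetical.

```python
import math


def threshold(n: int, eps: float) -> float:
    """The level (1 - eps) * sqrt(2 log N)."""
    return (1.0 - eps) * math.sqrt(2.0 * math.log(n))


def log_prob_max_below(n: int, eps: float) -> float:
    """log Pr(max_i G_i < threshold) for N independent standard Gaussians."""
    x = threshold(n, eps)
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))
    return n * math.log1p(-tail)


def log_bound_from_tail_estimate(n: int, eps: float) -> float:
    """log of (1 - lower bound of (7.31) at the threshold)^N."""
    x = threshold(n, eps)
    lower = x * math.exp(-x * x / 2.0) / (math.sqrt(2.0 * math.pi) * (1.0 + x * x))
    return n * math.log1p(-lower)


if __name__ == "__main__":
    for n in (10**4, 10**6):
        eps = 0.5
        print(n, log_prob_max_below(n, eps), log_bound_from_tail_estimate(n, eps), -(n ** eps))
```

Each printed row should be nondecreasing from left to right, matching the order of the inequalities in the proof.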

Cite this article

Li, Z. Exact Recovery of Community Detection in k-Partite Graph Models with Applications to Learning Electric Potentials in Electric Networks. J Stat Phys 182, 6 (2021). https://doi.org/10.1007/s10955-020-02690-1

  • DOI: https://doi.org/10.1007/s10955-020-02690-1
