Clique-Based Method for Social Network Clustering

Journal of Classification

Abstract

In this article, we develop a clique-based method for social network clustering. We introduce a new index to evaluate the quality of clustering results, and propose an efficient algorithm based on recursive bipartition to maximize an objective function of the proposed index. Because the optimization problem is NP-hard, we compute an approximate (semi-optimal) solution via an implicitly restarted Lanczos method. One advantage of our algorithm is that the proposed index of each community in the clustering result is guaranteed to be higher than a predetermined threshold, p, which is completely controlled by users. We also address the case in which p is unknown. A statistical procedure that simultaneously controls under-clustering and over-clustering errors is carried out to select a localized threshold for each subnetwork, such that the community detection accuracy is optimized. Accordingly, we propose a localized clustering algorithm based on a binary tree structure. Finally, we use stochastic blockmodels to conduct simulation studies and demonstrate the accuracy and efficiency of our algorithms, both numerically and graphically.
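The recursive bipartition with a user-chosen threshold p can be illustrated with a minimal sketch. The abstract does not define the proposed index, so the code below assumes plain edge density (which equals 1 for a clique) as a stand-in, and it replaces the implicitly restarted Lanczos method with a simple power iteration on the shifted graph Laplacian. The function names `sbm`, `density`, `bipartition`, and `cluster` are hypothetical and are not taken from the article.

```python
import random


def sbm(sizes, p_in, p_out, seed=0):
    """Sample an undirected stochastic blockmodel as a list of neighbor sets."""
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p_in if labels[i] == labels[j] else p_out
            if rng.random() < prob:
                adj[i].add(j)
                adj[j].add(i)
    return adj, labels


def density(nodes, adj):
    """Stand-in index (an assumption): fraction of node pairs joined by an edge."""
    k = len(nodes)
    if k < 2:
        return 1.0  # convention: a single node counts as a trivial clique
    members = set(nodes)
    edges = sum(len(adj[u] & members) for u in nodes) / 2
    return edges / (k * (k - 1) / 2)


def bipartition(nodes, adj, iters=300, seed=0):
    """Split nodes by the sign of an approximate Fiedler vector of the induced subgraph."""
    members = set(nodes)
    nbrs = {u: adj[u] & members for u in nodes}
    # 2 * (max degree) bounds the Laplacian eigenvalues, so shift * I - L is positive definite.
    shift = 2 * max(len(nbrs[u]) for u in nodes) + 1
    rng = random.Random(seed)
    x = {u: rng.random() - 0.5 for u in nodes}
    for _ in range(iters):
        mean = sum(x.values()) / len(nodes)
        x = {u: x[u] - mean for u in nodes}  # project out the constant eigenvector
        # Multiply by (shift * I - L), where L = D - A is the graph Laplacian.
        y = {u: (shift - len(nbrs[u])) * x[u] + sum(x[v] for v in nbrs[u]) for u in nodes}
        norm = sum(v * v for v in y.values()) ** 0.5 or 1.0
        x = {u: y[u] / norm for u in nodes}
    left = [u for u in nodes if x[u] < 0]
    right = [u for u in nodes if x[u] >= 0]
    if not left or not right:  # degenerate sign pattern: cut the ordering in half
        order = sorted(nodes, key=lambda u: x[u])
        left, right = order[: len(order) // 2], order[len(order) // 2:]
    return left, right


def cluster(nodes, adj, p):
    """Recursively bipartition until every part has density of at least p (0 <= p <= 1)."""
    if density(nodes, adj) >= p:
        return [sorted(nodes)]
    left, right = bipartition(nodes, adj)
    return cluster(left, adj, p) + cluster(right, adj, p)


# Illustrative run on a two-block stochastic blockmodel; the parameters are arbitrary.
adj, labels = sbm([12, 12], p_in=0.9, p_out=0.05, seed=1)
for part in cluster(list(range(len(adj))), adj, p=0.6):
    print(len(part), round(density(part, adj), 2))
```

In this sketch every returned part has density of at least p by construction, which mirrors the threshold guarantee described above only for the stand-in index. For large sparse networks, SciPy's `scipy.sparse.linalg.eigsh`, which wraps ARPACK's implicitly restarted Lanczos routines, would likely be a more suitable eigensolver than the power iteration used here.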

Acknowledgements

The authors would like to thank the Associate Editor and two anonymous reviewers for their insightful comments and suggestions on this manuscript.

Author information

Corresponding author

Correspondence to Panpan Zhang.

About this article

Cite this article

Ouyang, G., Dey, D.K. & Zhang, P. Clique-Based Method for Social Network Clustering. J Classif 37, 254–274 (2020). https://doi.org/10.1007/s00357-019-9310-5

  • DOI: https://doi.org/10.1007/s00357-019-9310-5
