Abstract
A key task in analyzing social networks and other complex networks is role analysis: describing and categorizing nodes according to how they interact with other nodes. Two nodes have the same role if they interact with equivalent sets of neighbors. The most fundamental role equivalence is automorphic equivalence. Unfortunately, the fastest known algorithms for graph automorphism are nonpolynomial. Moreover, since exact equivalence is rare, a more meaningful task is measuring the role similarity between any two nodes. This task is closely related to the structural or link-based similarity problem that SimRank addresses. However, SimRank and other existing similarity measures are not sufficient, because they are not guaranteed to recognize automorphically or structurally equivalent nodes. This article makes two contributions. First, we present and justify several axiomatic properties necessary for a role similarity measure or metric. Second, we present RoleSim, a new similarity metric that satisfies these axioms and can be computed with a simple iterative algorithm. We rigorously prove that RoleSim satisfies all of these axiomatic properties. We also introduce Iceberg RoleSim, a scalable algorithm that discovers all pairs with RoleSim scores above a user-defined threshold θ. We demonstrate the interpretative power of RoleSim on both synthetic and real datasets.
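To make the "simple iterative algorithm" concrete, here is a minimal Python sketch of a RoleSim-style iteration on a tiny graph. The abstract does not give the recurrence, so the update rule used here is an assumption based on the commonly cited form of RoleSim and should be checked against the full article. It starts every pair at 1 and repeatedly sets sim(u, v) = (1 − β) · (best neighbor-matching weight) / max(deg u, deg v) + β. The function name `role_similarity`, the decay parameter `beta`, the fixed iteration count, and the brute-force matching are all illustrative choices, and the sketch assumes every node has at least one neighbor.

```python
from itertools import permutations


def role_similarity(adj, beta=0.1, iterations=5):
    """RoleSim-style iteration (illustrative sketch, not the authors' code).

    adj maps each node to a collection of its neighbors (undirected graph).
    Assumption: no isolated nodes; brute-force matching, so small graphs only.
    """
    nodes = sorted(adj)
    if any(len(adj[u]) == 0 for u in nodes):
        raise ValueError("this sketch assumes every node has a neighbor")

    # Start from the all-ones similarity matrix.
    sim = {(u, v): 1.0 for u in nodes for v in nodes}

    for _ in range(iterations):
        updated = {}
        for u in nodes:
            for v in nodes:
                # Match the smaller neighborhood into the larger one.
                small, large = sorted((list(adj[u]), list(adj[v])), key=len)
                # Exhaustively search for the matching with the largest
                # total similarity (exponential; fine for toy inputs).
                best = max(
                    sum(sim[x, y] for x, y in zip(small, perm))
                    for perm in permutations(large, len(small))
                )
                updated[u, v] = (1 - beta) * best / len(large) + beta
        sim = updated
    return sim


# Path graph 0 - 1 - 2 - 3: nodes 0 and 3 (and 1 and 2) are automorphically
# equivalent even though they share no neighbors.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = role_similarity(path)
print(scores[0, 3], scores[1, 2], scores[0, 1])
```

On this path graph the two endpoints keep the maximum score even though they have no neighbor in common, while an endpoint and an interior node score strictly lower. The exhaustive matching is only practical for very small neighborhoods; a real implementation would substitute a proper weighted-matching routine or a greedy approximation.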