Scalable and axiomatic ranking of network role similarity

Authors:
Ruoming Jin, Kent State University, Kent, OH
Victor E. Lee, John Carroll University
Longjie Li, Lanzhou University

ACM Transactions on Knowledge Discovery from Data, Volume 8, Issue 1, Article No. 3, pp. 1–37. https://doi.org/10.1145/2518176

Published: 01 February 2014

Abstract

A key task in analyzing social networks and other complex networks is role analysis: describing and categorizing nodes according to how they interact with other nodes. Two nodes have the same role if they interact with equivalent sets of neighbors. The most fundamental role equivalence is automorphic equivalence. Unfortunately, the fastest algorithms known for graph automorphism are nonpolynomial. Moreover, since exact equivalence is rare, a more meaningful task is measuring the role similarity between any two nodes. This task is closely related to the structural or link-based similarity problem that SimRank addresses. However, SimRank and other existing similarity measures are not sufficient because they are not guaranteed to recognize automorphically or structurally equivalent nodes. This article makes two contributions. First, we present and justify several axiomatic properties necessary for a role similarity measure or metric. Second, we present RoleSim, a new similarity metric that satisfies these axioms and can be computed with a simple iterative algorithm. We rigorously prove that RoleSim satisfies all of these axiomatic properties. We also introduce Iceberg RoleSim, a scalable algorithm that discovers all pairs with RoleSim scores above a user-defined threshold θ. We demonstrate the interpretative power of RoleSim on both synthetic and real datasets.
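
To make the "simple iterative algorithm" more concrete, here is a minimal sketch. It assumes the update rule RoleSim(u, v) = (1 − β) · max over matchings M of Σ RoleSim(x, y) / max(N_u, N_v) + β, with all scores initialized to 1; that rule is not stated on this page and is taken as an assumption. The names `rolesim`, `max_matching_weight`, and `pairs_above`, the value β = 0.15, and the iteration count are illustrative choices. The brute-force matching is only practical for very small graphs, and `pairs_above` is a naive post-filter for the threshold θ, not the Iceberg RoleSim algorithm itself.

```python
from itertools import permutations


def max_matching_weight(nu, nv, sim):
    """Best total similarity over all one-to-one matchings of two neighbor lists.

    Brute force (an assumption for illustration): only usable for tiny degrees.
    """
    small, large = (nu, nv) if len(nu) <= len(nv) else (nv, nu)
    return max(
        sum(sim[(x, y)] for x, y in zip(small, perm))
        for perm in permutations(large, len(small))
    )


def rolesim(adj, beta=0.15, iterations=10):
    """Iterate the assumed RoleSim update on an undirected adjacency dict."""
    nodes = sorted(adj)
    # Start from the most optimistic guess: every pair is fully similar.
    sim = {(u, v): 1.0 for u in nodes for v in nodes}
    for _ in range(iterations):
        new = {}
        for u in nodes:
            for v in nodes:
                top = max(len(adj[u]), len(adj[v]))
                if top == 0:
                    # Assumption: two isolated nodes are treated as equivalent.
                    new[(u, v)] = 1.0
                    continue
                w = max_matching_weight(adj[u], adj[v], sim)
                new[(u, v)] = (1 - beta) * w / top + beta
        sim = new
    return sim


def pairs_above(sim, theta):
    """Naive threshold filter: distinct pairs whose score is at least theta."""
    return sorted((u, v) for (u, v), s in sim.items() if u < v and s >= theta)


# Three-node path a - b - c: the end nodes a and c are automorphically equivalent.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = rolesim(path)
print(pairs_above(scores, 0.9))
```

On this path graph the end nodes a and c keep a score of 1 at every iteration, while the score of (a, b) decreases, so only the pair (a, c) survives a threshold of 0.9.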

Reviews

Reviewer: Hector Zenil

The thorough survey of measures for network similarity makes for a fine paper. This paper aims to assess and dissect the main assumptions of a network role similarity metric. Such a metric does not just match the topological properties of one network onto another or achieve the same goal through graph theory; it finds nodes that may have similar roles to others in the way they are connected. Think of two families where the roles are clear: there are almost always two parents, yet the two families may have a different number of children and/or related family members. When comparing two networks, one may want to find the nodes playing the role of parent in both networks. Little by little, the authors explore and build the intuition for defining ranking metrics beyond SimRank that are up to the task.

The paper is technically precise, highly understandable, and well written. Accessible to readers with only a little background in network science, it is essential reading for network scientists, whether they are new to the field or in need of similarity metrics beyond more traditional ones such as the Jaccard index or link similarity.

The authors go on to introduce their own axiomatic role similarity metric, with full understanding of the current and previous literature on the subject. They aim to provide a sound measure with nothing but the essential properties for optimal network role similarity. Then they return to more established algorithms and test them against their axiomatization. They show, for example, that SimRank is not admissible because automorphism confirmation does not hold; this is also true for MatchSim. The authors claim, however, that their RoleSim similarity measure is optimal. They later proceed to experimental evaluation and concerns related to the time complexity of algorithms and other aspects, even testing on real-world networks (the Internet).

The appendix is full of details, including theorems and proofs that support the claims in the main text.

Online Computing Reviews Service
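
The remark that automorphism confirmation does not hold for SimRank can be checked on a toy graph. The sketch below assumes the standard SimRank recurrence with decay factor C = 0.8 and treats each undirected edge as an in-link in both directions; the function name `simrank`, the decay value, and the example graph are illustrative choices, not taken from the paper.

```python
def simrank(adj, c=0.8, iterations=10):
    """Plain iterative SimRank on an adjacency dict (neighbors act as in-links)."""
    nodes = sorted(adj)
    # Standard start: a node is fully similar to itself and to nothing else.
    sim = {(u, v): 1.0 if u == v else 0.0 for u in nodes for v in nodes}
    for _ in range(iterations):
        new = {}
        for u in nodes:
            for v in nodes:
                if u == v:
                    new[(u, v)] = 1.0
                elif not adj[u] or not adj[v]:
                    new[(u, v)] = 0.0
                else:
                    # Average similarity over all pairs of neighbors, damped by c.
                    total = sum(sim[(x, y)] for x in adj[u] for y in adj[v])
                    new[(u, v)] = c * total / (len(adj[u]) * len(adj[v]))
        sim = new
    return sim


# Three-node path a - b - c: swapping a and c is a symmetry of the graph.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
sr = simrank(path)
print(sr[("a", "c")])
```

Because a and c share the single neighbor b, their SimRank score equals the decay factor C (0.8 here) rather than 1, even though the two nodes are automorphically equivalent.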

Published in

ACM Transactions on Knowledge Discovery from Data Volume 8, Issue 1
Casin special issue
February 2014
157 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/2582178

Copyright © 2014 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 February 2014
- Accepted: 1 August 2013
- Revised: 1 April 2013
- Received: 1 September 2012
Author Tags
Complex network
automorphic equivalence
ranking
role similarity
social network
vertex similarity
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 20
- Total Downloads: 515
- Downloads (Last 12 months): 22
- Downloads (Last 6 weeks): 3