
GS4: Graph stream summarization based on both the structure and semantics

The Journal of Supercomputing

Abstract

Internet-based applications now collect and distribute large datasets, which are commonly modeled as massive graphs. One way to process such graphs is summarization. Graphs are either stationary or streaming. Several algorithms summarize stationary graphs; however, no comprehensive method has been devised for graph streams, because of two challenges they pose: high data volume and continuous change over time. To tackle these challenges, we propose a novel method based on the sliding window model that performs summarization using both the structure and the vertex attributes of the input graph stream. We devise a new structure for the summary graph that incorporates both structural and semantic attributes and can therefore better describe a heterogeneous summary graph. Moreover, our framework includes new components for comparing hybrid summary graphs. To the best of our knowledge, this is the first method that summarizes a graph stream using both structure and vertex attributes with varying contributions. Our approach also takes user directions and ontology into account. To study the efficiency and effectiveness of the proposed method, we conduct extensive experiments on two real-life datasets: American political web-logs and Amazon co-purchasing products. The experimental results confirm that the proposed method generates higher-quality graph summaries than existing approaches. The expected time complexity of the proposed method, \(O(n^3)\), is a significant improvement over the best existing complexity, \(O(n^5)\).
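The two ideas named in the abstract, a sliding window over the stream and a similarity that mixes structure with vertex attributes in adjustable proportions, can be illustrated with a minimal sketch. This is not the GS4 algorithm itself: the class `SlidingWindowGraph`, the function `hybrid_similarity`, the weight `alpha`, and the use of Jaccard similarity for both components are assumptions made purely for illustration, as are the toy stream and attribute values.

```python
from collections import deque
from typing import Deque, Dict, Set, Tuple

Edge = Tuple[int, str, str]  # (timestamp, source vertex, target vertex)


class SlidingWindowGraph:
    """Hypothetical helper: keeps only edges from the last `width` time units."""

    def __init__(self, width: int) -> None:
        self.width = width
        self.edges: Deque[Edge] = deque()

    def add(self, timestamp: int, u: str, v: str) -> None:
        """Append a new edge and evict edges that have slid out of the window."""
        self.edges.append((timestamp, u, v))
        while self.edges and self.edges[0][0] <= timestamp - self.width:
            self.edges.popleft()

    def neighbors(self) -> Dict[str, Set[str]]:
        """Build an undirected adjacency map from the edges currently in the window."""
        adj: Dict[str, Set[str]] = {}
        for _, u, v in self.edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        return adj


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hybrid_similarity(
    u: str,
    v: str,
    adj: Dict[str, Set[str]],
    attrs: Dict[str, Set[str]],
    alpha: float,
) -> float:
    """Weighted mix of structural and attribute similarity.

    `alpha` (an assumed parameter) is the contribution of structure;
    `1 - alpha` is the contribution of vertex attributes.
    """
    structural = jaccard(adj.get(u, set()), adj.get(v, set()))
    semantic = jaccard(attrs.get(u, set()), attrs.get(v, set()))
    return alpha * structural + (1.0 - alpha) * semantic


# Toy stream: with width=3, edges at timestamps 1 and 2 are evicted once t=5 arrives.
window = SlidingWindowGraph(width=3)
for t, src, dst in [(1, "a", "c"), (2, "b", "c"), (3, "a", "d"), (4, "b", "d"), (5, "a", "e")]:
    window.add(t, src, dst)

adjacency = window.neighbors()
attributes = {"a": {"liberal"}, "b": {"liberal"}, "d": {"conservative"}}

score = hybrid_similarity("a", "b", adjacency, attributes, alpha=0.5)
```

In this toy run, vertices `a` and `b` share one of two distinct neighbors inside the window and have identical attribute sets, so an equal weighting yields a score halfway between the two component similarities. Vertices whose score exceeds some threshold would be candidates for merging into one supernode of the summary; how GS4 actually groups vertices and sets the contributions is described in the full article, not here.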

Notes

  1. https://snap.stanford.edu/data.

  2. https://scholar.google.com/citations?user=Q_kKkIUAAAAJ&hl=en.

  3. http://www-personal.umich.edu/~mejn/netdata.

Author information

Correspondence to Mohammad Reza Kangavari.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ashrafi-Payaman, N., Kangavari, M.R., Hosseini, S. et al. GS4: Graph stream summarization based on both the structure and semantics. J Supercomput 77, 2713–2733 (2021). https://doi.org/10.1007/s11227-020-03290-2

  • DOI: https://doi.org/10.1007/s11227-020-03290-2
