Abstract
In recent years, cohesive subgraph mining in bipartite graphs has become a popular research topic. An important cohesive subgraph model, the k-bitruss, is the maximal cohesive subgraph in which each edge is contained in at least k butterflies (i.e., (2, 2)-bicliques). In this paper, we study the bitruss decomposition problem, which aims to find all the k-bitrusses for \(k \ge 0\). The existing algorithms follow a bottom-up strategy that iteratively peels the edges with the lowest butterfly support. In this peeling process, enumerating all the supporting butterflies of each edge is time-consuming. To solve this issue, we propose a novel online index, the \(\mathsf {BE}\)-\(\mathsf {Index}\), which compresses butterflies into k-blooms (i.e., (2, k)-bicliques). Based on the \(\mathsf {BE}\)-\(\mathsf {Index}\), we propose a new bitruss decomposition algorithm, \(\mathsf {BiT}\)-\(\mathsf {BU}\), along with two batch-based optimizations, to perform the butterfly enumeration of the peeling process efficiently. Furthermore, we design the \(\mathsf {BiT}\)-\(\mathsf {PC}\) algorithm, which handles edges with high butterfly support more efficiently. We also explore shared-memory parallel solutions to handle large graphs more efficiently, and in these parallel algorithms we propose effective techniques to reduce conflicts among threads. We show theoretically that our new algorithms significantly reduce the time complexities of the existing algorithms. In addition, we conduct extensive empirical evaluations on real-world datasets. The experimental results further validate the effectiveness of the bitruss model and demonstrate that our proposed solutions outperform the state-of-the-art techniques by several orders of magnitude.
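To make the definitions above concrete, the following is a minimal sketch of the baseline bottom-up peeling strategy that the abstract attributes to existing algorithms. It is not the \(\mathsf {BE}\)-\(\mathsf {Index}\), \(\mathsf {BiT}\)-\(\mathsf {BU}\), or \(\mathsf {BiT}\)-\(\mathsf {PC}\) method. The function names `butterfly_support` and `bitruss_decomposition` and the edge representation (pairs of an upper and a lower vertex) are assumptions made for illustration only.

```python
from collections import defaultdict


def _adjacency(edges):
    """Build neighbour sets for upper and lower vertices of a bipartite edge set."""
    adj_u = defaultdict(set)  # upper vertex -> lower neighbours
    adj_v = defaultdict(set)  # lower vertex -> upper neighbours
    for u, v in edges:
        adj_u[u].add(v)
        adj_v[v].add(u)
    return adj_u, adj_v


def butterfly_support(edges):
    """Return {edge: number of butterflies ((2, 2)-bicliques) containing it}."""
    edges = set(edges)
    adj_u, adj_v = _adjacency(edges)
    support = {}
    for u, v in edges:
        count = 0
        for w in adj_v[v]:
            if w == u:
                continue
            # u and w share v; every other shared lower vertex closes a butterfly.
            count += len(adj_u[u] & adj_u[w]) - 1
        support[(u, v)] = count
    return support


def bitruss_decomposition(edges):
    """Naive peeling: return {edge: largest k such that the edge is in the k-bitruss}."""
    remaining = set(edges)
    adj_u, adj_v = _adjacency(remaining)
    support = butterfly_support(remaining)
    bitruss_number = {}
    k = 0
    while remaining:
        # Peel an edge with the lowest current butterfly support.
        e = min(remaining, key=support.__getitem__)
        u, v = e
        k = max(k, support[e])
        bitruss_number[e] = k
        # Enumerate every butterfly containing e and update its other three edges.
        for w in adj_v[v]:
            if w == u:
                continue
            for x in adj_u[u] & adj_u[w]:
                if x == v:
                    continue
                for f in ((u, x), (w, v), (w, x)):
                    support[f] -= 1
        remaining.remove(e)
        adj_u[u].discard(v)
        adj_v[v].discard(u)
    return bitruss_number


if __name__ == "__main__":
    # One butterfly on {0, 1} x {"a", "b"} plus a pendant edge (2, "a").
    g = [(0, "a"), (0, "b"), (1, "a"), (1, "b"), (2, "a")]
    print(bitruss_decomposition(g))
```

The inner loops of `bitruss_decomposition` enumerate each supporting butterfly explicitly when an edge is peeled, which is the cost the abstract identifies as the bottleneck of the existing approach.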
Acknowledgements
Xuemin Lin is supported by the National Key R&D Program of China under grant 2018AAA0102502 and ARC DP200101338. Lu Qin is supported by ARC FT200100787. Wenjie Zhang is supported by ARC DP210101393 and ARC DP200101116. Ying Zhang is supported by FT170100128 and ARC DP180103096.
Cite this article
Wang, K., Lin, X., Qin, L. et al. Towards efficient solutions of bitruss decomposition for large-scale bipartite graphs. The VLDB Journal 31, 203–226 (2022). https://doi.org/10.1007/s00778-021-00658-5
DOI: https://doi.org/10.1007/s00778-021-00658-5