Abstract
Although the vertex-centric model originating from Pregel has been well studied, its applications are limited and biased. While PageRank, which can be formalized as a network flow problem, is one of the most common (or overused) applications, the maximum flow problem, which is the most fundamental problem on flow networks, has never been dealt with. In this work, in order to analyze the applicability of the vertex-centric model, we have implemented the push-relabel algorithm for efficiently solving the maximum flow problem on top of Pregel. This paper presents our implementation involving important heuristics, analyzes the applicability of the vertex-centric model, and reports experimental results on our implementation.
Similar content being viewed by others
Notes
In the terminology of the vertex-centric model, sleeping vertices are inactive, while waking ones are active. In this paper, we distinguish them from the notion of active vertices in the push-relabel algorithm (Sect. 3).
We can replace \(\infty \) with n by modifying the condition of active vertices.
That is, \({\text{O}}(\deg (v) \log ^k \deg (v))\) for some constant k.
References
Ahuja, R.K., Orlin, J.B., Stein, C., Tarjan, R.E.: Improved algorithms for bipartite network flow. SIAM J. Comput. 23(5), 906–933 (1994)
Anderson, T.E., Owicki, S.S., Saxe, J.B., Thacker, C.P.: High speed switch scheduling for local area networks. ACM Trans. Comput. Syst. 11(4), 319–352 (1993)
Batarfi, O., Shawi, R.E., Fayoumi, A.G., Nouri, R., Beheshti, S., Barnawi, A., Sakr, S.: Large scale graph processing systems: survey and an experimental evaluation. Clust. Comput. 18(3), 1189–1213 (2015)
Baumstark, N., Blelloch, G., Shun, J.: Efficient implementation of a synchronous parallel push–relabel algorithm. In: Proceedings of 23rd Annual European Symposium on Algorithms (ESA ’15), pp. 106–117. Springer (2015)
Chen, R., Shi, J., Chen, Y., Chen, H.: PowerLyra: differentiated graph computation and partitioning on skewed graphs. In: Proceedings of 20th European Conference on Computer Systems (EuroSys ’15), pp. 1:1–1:15. ACM (2015)
Cherkassky, B.V., Goldberg, A.V.: On implementing the push–relabel method for the maximum flow problem. Algorithmica 19(4), 390–410 (1997)
Ching, A., Edunov, S., Kabiljo, M., Logothetis, D., Muthukrishnan, S.: One trillion edges: graph processing at Facebook-scale. PVLDB 8(12), 1804–1815 (2015)
Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51, 107–113 (2008)
Delong, A., Boykov, Y.: A scalable graph-cut algorithm for N-D Grids. In: Proceedings of 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’08). IEEE (2008)
Doekemeijer, N., Varbanescu, A.L.: A Survey of Parallel Graph Processing Frameworks. Tech. Rep. PDS-2014-003, Delft University of Technology (2014)
Goldberg, A.V.: Efficient graph algorithms for sequential and parallel computers. Ph.D. Thesis, Massachusetts Institute of Technology (1987)
Goldberg, A.V.: Two-level push–relabel algorithm for the maximum flow problem. In: Proceedings of 5th International Conference Algorithmic Aspects in Information and Management (AAAI ’09), pp. 212–225. Springer (2009)
Goldberg, A.V., Tarjan, R.E.: A new approach to the maximum-flow problem. J. ACM 35(4), 921–940 (1988)
Gonzalez, J.E., Low, Y., Gu, H., Bickson, D., Guestrin, C.: PowerGraph: distributed graph-parallel computation on natural graphs. In: Proceedings of 10th USENIX Conference on Operating Systems Design and Implementation (OSDI ’10), pp. 17–30. USENIX (2012)
Gonzalez, J.E., Xin, R.S., Dave, A., Crankshaw, D., Franklin, M.J., Stoica, I.: GraphX: graph processing in a distributed dataflow framework. In: Proceedings of 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’14), pp. 599–613. USENIX (2014)
Guo, Y., Biczak, M., Varbanescu, A.L., Iosup, A., Martella, C., Willke, T.L.: How well do graph-processing platforms perform? An empirical performance evaluation and analysis. In: Proceedings of 28th International Parallel and Distributed Processing Symposium (IPDPS ’14), pp. 395–404. IEEE (2014)
Halim, F., Yap, R.H.C., Wu, Y.: A MapReduce-based maximum-flow algorithm for large small-world network graphs. In: Proceedings of 2011 International Conference on Distributed Computing Systems (ICDCS ’11), pp. 192–202. IEEE (2011)
Iosup, A., Hegeman, T., Ngai, W.L., Heldens, S., Prat-Pérez, A., Manhardt, T., Chafi, H., Capota, M., Sundaram, N., Anderson, M.J., Tanase, I.G., Xia, Y., Nai, L., Boncz, P.A.: LDBC graphalytics: a benchmark for large-scale graph analysis on parallel and distributed platforms. PVLDB 9(13), 1317–1328 (2016)
Jiang, J., Wu, L.: Two-stage distributed parallel algorithm with message passing interface for maximum flow problem. J. Supercomput. 71(2), 629–647 (2015)
Kalavri, V., Vlassov, V., Haridi, S.: High-level programming abstractions for distributed graph processing. IEEE Trans. Knowl. Data Eng. 30(2), 305–324 (2018)
Kulkarni, M., Burtscher, M., Inkulu, R., Pingali, K., Cascaval, C.: How much parallelism is there in irregular applications? In: Proceedings of 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP ’09), pp. 3–14. ACM (2009)
Langewisch, R.P.: A performance study of an implementation of the push-relabel maximum flow algorithm in apache spark’s graphx. Master’s Thesis, Colorado School of Mines (2015)
Lu, Y., Cheng, J., Yan, D., Wu, H.: Large-scale distributed graph computing systems: an experimental evaluation. PVLDB 8(3), 281–292 (2014)
Malewicz, G., Austern, M.H., Bik, A.J.C., Dehnert, J.C., Horn, I., Leiser, N., Czajkowski, G.: Pregel: a system for large-scale graph processing. In: Proceedings of 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD ’10), pp. 135–146. ACM (2010)
McCune, R.R., Weninger, T., Madey, G.: Thinking like a vertex: a survey of vertex-centric frameworks for large-scale distributed graph processing. ACM Comput. Surv. 48(2), 25:1–25:39 (2015)
Negruseri, C.S., Pacsosi, M.B., Stanley, B., Stein, C., Strat, C.G.: Solving maximum flow problems on real-world bipartite graphs. J. Exp. Algorithmics 16(3), 3.5:3.1–3.5:3.25 (2011)
Petroni, F., Querzoni, L., Daudjee, K., Kamali, S., Iacoboni, G.: HDRF: stream-based partitioning for power-law graphs. In: Proceedings of 24th ACM International on Conference on Information and Knowledge Management (CIKM ’15), pp. 243–252. ACM (2015)
Quamar, A., Deshpande, A., Lin, J.J.: NScale: neighborhood-centric large-scale graph analytics in the cloud. VLDB J. 25(2), 125–150 (2016)
Roy, A., Mihailovic, I., Zwaenepoel, W.: X-Stream: edge-centric graph processing using streaming partitions. In: Proceedings of ACM SIGOPS 24th Symposium on Operating Systems Principles (SOSP ’13), pp. 472–488. ACM (2013)
Salihoglu, S., Widom, J.: Optimizing graph algorithms on pregel-like systems. PVLDB 7(7), 577–588 (2014)
Satish, N., Sundaram, N., Patwary, M.M.A., Seo, J., Park, J., Hassaan, M.A., Sengupta, S., Yin, Z., Dubey, P.: Navigating the maze of graph analytics frameworks using massive graph datasets. In: Proceedings of 2014 ACM SIGMOD International Conference on Management of Data (SIGMOD ’14), pp. 979–990. ACM (2014)
Shekhovtsov, A., Hlaváč, V.: A distributed mincut/maxflow algorithm combining path augmentation and push-relabel. Int. J. Comput. Vis. 104(3), 315–342 (2013)
Tian, Y., Balmin, A., Corsten, S.A., Tatikonda, S., McPherson, J.: From “Think Like a Vertex” to “Think Like a Graph”. PVLDB 7(3), 193–204 (2013)
Valiant, L.G.: A bridging model for parallel computation. Commun. ACM 33(8), 103–111 (1990)
Xie, C., Yan, L., Li, W.J., Zhang, Z.: Distributed power-law graph computing: theoretical and empirical analysis. In: Proceedings of 27th International Conference on Neural Information Processing Systems (NIPS ’14), pp. 1673–1681. MIT Press (2014)
Yan, D., Cheng, J., Lu, Y., Ng, W.: Blogel: a block-centric framework for distributed computation on real-world graphs. PVLDB 7(13), 1981–1992 (2014)
Yan, D., Cheng, J., Lu, Y., Ng, W.: Effective Techniques for Message Reduction and Load Balancing in Distributed Graph Computation. In: Proceedings of 24th International Conference on World Wide Web (WWW ’15), pp. 1307–1317. ACM (2015)
Zaharia, M., Xin, R.S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., Franklin, M.J., Ghodsi, A., Gonzalez, J., Shenker, S., Stoica, I.: Apache spark: a unified engine for big data processing. Commun. ACM 59(11), 56–65 (2016)
Author information
Authors and Affiliations
Corresponding author
About this article
Cite this article
Sato, S. On Implementing the Push-Relabel Algorithm on Top of Pregel. New Gener. Comput. 36, 419–449 (2018). https://doi.org/10.1007/s00354-018-0042-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00354-018-0042-6