Monoparametric Tiling of Polyhedral Programs

International Journal of Parallel Programming

Abstract

Tiling is a crucial program transformation: it adjusts the ops-to-bytes balance of a code to improve locality. Like parallelism, it can be applied at multiple levels. Allowing tile sizes to remain symbolic parameters at compile time has many benefits, including efficient autotuning and run-time adaptability to system variations. For polyhedral programs, parametric tiling in its full generality is known to be non-linear, which breaks the mathematical closure properties of the polyhedral model. Most compilation tools therefore either perform fixed-size tiling or apply parametric tiling only in the final code generation step. We introduce monoparametric tiling, a restricted parametric tiling transformation, and show that, despite being parametric, it retains the closure properties of the polyhedral model. We first prove that applying monoparametric partitioning (i) to a polyhedron yields a union of polyhedra with modulo conditions, and (ii) to an affine function produces a piecewise-affine function with modulo conditions. We then use these properties to show how to tile an entire polyhedral program. Monoparametric tiling is general enough to handle any tile shape that can tessellate the iteration space (e.g., hexagonal or trapezoidal). Because the tiled program remains polyhedral, a wide range of polyhedral analyses and transformations can still be applied to it.
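To make properties (i) and (ii) concrete, here is a minimal Python sketch on a toy case. It is not taken from the article or from the authors' tools. It assumes square tiles whose side is the single parameter b, the triangle {(i, j) | 0 <= j <= i < N} as the polyhedron, and i -> i - 1 as the affine function. All function names are hypothetical. Each index is split as i = b * alpha + ii with 0 <= ii < b, and the size parameter as N = b * Nb + Nl with 0 <= Nl < b.

```python
def in_triangle(i, j, N):
    """Original polyhedron: {(i, j) | 0 <= j <= i < N}."""
    return 0 <= j <= i < N


def in_tiled_triangle(alpha, beta, ii, jj, b, Nb, Nl):
    """Same set in tiled coordinates, written as a union of polyhedra.

    Every test below is affine in (alpha, beta, ii, jj, b, Nb, Nl):
    no product such as b * alpha appears.
    """
    if not (0 <= ii < b and 0 <= jj < b):
        return False                  # local indices must lie inside a tile
    if not (0 <= beta <= alpha <= Nb):
        return False                  # tile does not meet the triangle
    if alpha == beta and jj > ii:
        return False                  # diagonal tile, cut by j <= i
    if alpha == Nb and ii >= Nl:
        return False                  # last tile column, cut by i < N
    return True


def tiled_pred(alpha, ii, b):
    """Tiled form of the affine function i -> i - 1 (piecewise affine)."""
    if ii >= 1:
        return alpha, ii - 1          # stay in the same tile
    return alpha - 1, b - 1           # cross into the previous tile


def check(N, b):
    """Brute-force comparison of original and tiled descriptions."""
    Nb, Nl = divmod(N, b)             # N = b * Nb + Nl, 0 <= Nl < b
    for i in range(-b, N + b):
        alpha, ii = divmod(i, b)      # i = b * alpha + ii, 0 <= ii < b
        if divmod(i - 1, b) != tiled_pred(alpha, ii, b):
            return False
        for j in range(-b, N + b):
            beta, jj = divmod(j, b)
            expected = in_triangle(i, j, N)
            if expected != in_tiled_triangle(alpha, beta, ii, jj, b, Nb, Nl):
                return False
    return True
```

Calling `check(N, b)` for a few small values compares the two descriptions point by point. In this square-tile example the case analysis needs no modulo condition; the abstract states that the general result does involve them.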

Notes

  1. Available at https://github.com/guillaumeiooss/MPP.

  2. An online demonstration is available at http://foobar.ens-lyon.fr/mppcodegen/index.php.

  3. See the online demonstration at http://foobar.ens-lyon.fr/mppcodegen/index.php.

  4. \(x \in [|a;b|]\) means that \(x \in [a;b]\) and \(x\) is an integer.

  5. Our library is available at https://github.com/guillaumeiooss/MPP.

  6. Our compiler is available at http://foobar.ens-lyon.fr/mppcodegen/index.php and may be tried online.

  7. http://www.cs.colostate.edu/AlphaZsvn/Development/trunk/mde/edu.csu.melange.alphaz.polybench/polybench-alpha-4.0/.

  8. The full exploration framework and logs are available at https://guillaume.iooss.fr/CART/IJPP/explo_experiments.tar.gz.

Author information

Corresponding author

Correspondence to Guillaume Iooss.

Cite this article

Iooss, G., Alias, C. & Rajopadhye, S. Monoparametric Tiling of Polyhedral Programs. Int J Parallel Prog 49, 376–409 (2021). https://doi.org/10.1007/s10766-021-00694-2

  • DOI: https://doi.org/10.1007/s10766-021-00694-2
