Abstract
Tiling is a crucial program transformation, adjusting the ops-to-bytes balance of codes to improve locality. Like parallelism, it can be applied at multiple levels. Allowing tile sizes to be symbolic parameters at compile time has many benefits, including efficient autotuning and run-time adaptability to system variations. For polyhedral programs, parametric tiling in its full generality is known to be non-linear, breaking the mathematical closure properties of the polyhedral model. Most compilation tools therefore either perform fixed-size tiling or apply parametric tiling only in the final code generation step. We introduce monoparametric tiling, a restricted parametric tiling transformation. We show that, despite being parametric, it retains the closure properties of the polyhedral model. We first prove that applying monoparametric partitioning (i) to a polyhedron yields a union of polyhedra with modulo conditions, and (ii) to an affine function produces a piecewise-affine function with modulo conditions. We then use these properties to show how to tile an entire polyhedral program. Our monoparametric tiling is general enough to handle tiles of arbitrary shape that can tessellate the iteration space (e.g., hexagonal, trapezoidal, etc.). This enables a wide range of polyhedral analyses and transformations to be applied.
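As an illustration of property (i), the following is a minimal sketch on a hand-picked example, not the authors' algorithm or the API of their library. It assumes square tiles of a single symbolic size b, the triangle 0 <= j <= i < n as the polyhedron, and the decompositions i = b*ib + il, j = b*jb + jl, n = b*nb + nl with local parts in [0, b). All function and variable names are hypothetical. In this simple case the tiled set is a union of affine pieces with no modulo conditions; the general result stated above allows them.

```python
def in_triangle(i, j, n):
    """Original polyhedron: 0 <= j <= i < n."""
    return 0 <= j <= i < n


def in_tiled_triangle(ib, jb, il, jl, nb, nl, b):
    """Same set in tiled coordinates, as a union of affine pieces.

    Assumption: i = b*ib + il, j = b*jb + jl, n = b*nb + nl,
    with 0 <= il, jl, nl < b. No test multiplies b by an index.
    """
    # Local (intra-tile) indices must lie inside one b-by-b tile.
    if not (0 <= il < b and 0 <= jl < b):
        return False
    # j >= 0 reduces to jb >= 0, because 0 <= jl < b.
    if jb < 0:
        return False
    # i < n: a strictly earlier block row, or the last block with il < nl.
    below_n = ib < nb or (ib == nb and il < nl)
    # j <= i: a tile strictly below the diagonal, or a diagonal tile
    # restricted to its lower-triangular part.
    below_diag = jb < ib or (jb == ib and jl <= il)
    return below_n and below_diag


def check(n, b):
    """Compare both descriptions on a box that contains the triangle."""
    nb, nl = divmod(n, b)
    for i in range(-b, n + b):
        for j in range(-b, n + b):
            ib, il = divmod(i, b)  # floor division keeps il in [0, b)
            jb, jl = divmod(j, b)
            tiled = in_tiled_triangle(ib, jb, il, jl, nb, nl, b)
            if in_triangle(i, j, n) != tiled:
                return False
    return True
```

Calling `check(n, b)` for several values of n and b compares the two descriptions point by point. The case split on the block indices is what keeps every test affine in (ib, jb, il, jl, nb, nl) even though b stays symbolic.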
Notes
The library is available at https://github.com/guillaumeiooss/MPP.
An online demonstration is available at http://foobar.ens-lyon.fr/mppcodegen/index.php.
The notation \(x \in [|a;b|]\) means that \(x \in [a;b]\) and \(x\) is an integer.
Our compiler is available at http://foobar.ens-lyon.fr/mppcodegen/index.php and may be tried online.
The full exploration framework and logs are available at https://guillaume.iooss.fr/CART/IJPP/explo_experiments.tar.gz.
Cite this article
Iooss, G., Alias, C. & Rajopadhye, S. Monoparametric Tiling of Polyhedral Programs. Int J Parallel Prog 49, 376–409 (2021). https://doi.org/10.1007/s10766-021-00694-2
DOI: https://doi.org/10.1007/s10766-021-00694-2