A heterogeneous parallel Red–Black SOR technique and the numerical study on SIMPLE

The Journal of Supercomputing

Abstract

A basic heterogeneous parallel implementation of Red–Black successive over-relaxation (SOR), the mono-color floating-point scheme, was developed for graphics processing units (GPUs) on the OpenCL platform. Built on a fine-grained design, a compact data structure, and a stencil function, the scheme uses a concise mapping relationship to describe implicitly the complex rules for locating neighbor elements, which avoids the low GPU utilization of the traditional Red–Black SOR scheme. The new mono-color floating-point scheme was applied to build a fast Semi-Implicit Method for Pressure Linked Equations (SIMPLE) solver with OpenCL and OpenMP on a heterogeneous parallel computing device. Compared with a SIMPLE solver using the traditional Red–Black SOR scheme, the new scheme runs 1.7 to 1.8 times faster on the same GPU. It also eliminates the complex searching module of the mono-color logical scheme and outperforms that scheme by 20–30%. Numerical cases in double precision showed that the SIMPLE solver on the GPU with the new Red–Black SOR scheme could save up to 92% of the computing time compared with the serial solver on the CPU.
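For readers unfamiliar with the baseline method, the following is a minimal serial sketch of textbook Red–Black SOR for the 5-point discrete Poisson equation. It is not the mono-color floating-point scheme described above, and it does not use OpenCL or OpenMP. The function name `red_black_sor`, its parameters, and the test problem are assumptions chosen for illustration only. The point it shows is that elements of one color depend only on elements of the other color, so each half-sweep could be updated in parallel.

```python
def red_black_sor(phi, rhs, h, omega=1.5, sweeps=200):
    """Relax the 5-point discrete Poisson equation lap(phi) = rhs in place.

    phi   : list of lists holding the field; boundary entries are kept
            fixed (Dirichlet conditions).
    rhs   : list of lists with the right-hand side on the same grid.
    h     : uniform grid spacing.
    omega : relaxation factor (hypothetical default, not from the paper).
    """
    n, m = len(phi), len(phi[0])
    for _ in range(sweeps):
        # Color 0 ("red") holds cells with (i + j) even, color 1 ("black")
        # holds cells with (i + j) odd. All cells of one color are mutually
        # independent, so this inner double loop is the parallel region.
        for color in (0, 1):
            for i in range(1, n - 1):
                # First interior column j with (i + j) % 2 == color.
                j0 = 1 + (i + 1 + color) % 2
                for j in range(j0, m - 1, 2):
                    gauss_seidel = 0.25 * (
                        phi[i - 1][j] + phi[i + 1][j]
                        + phi[i][j - 1] + phi[i][j + 1]
                        - h * h * rhs[i][j]
                    )
                    # Over-relaxed update toward the Gauss-Seidel value.
                    phi[i][j] += omega * (gauss_seidel - phi[i][j])
    return phi


if __name__ == "__main__":
    # Laplace problem whose exact solution is phi = x + y.
    n = 9
    h = 1.0 / (n - 1)
    on_edge = lambda i, j: i in (0, n - 1) or j in (0, n - 1)
    phi = [[(i + j) * h if on_edge(i, j) else 0.0 for j in range(n)]
           for i in range(n)]
    rhs = [[0.0] * n for _ in range(n)]
    red_black_sor(phi, rhs, h)
    print(max(abs(phi[i][j] - (i + j) * h)
              for i in range(n) for j in range(n)))
```

In this straightforward layout, red and black values are interleaved in one array, so each half-sweep touches only every second entry. That strided access is one plausible source of the low GPU utilization that the abstract attributes to the traditional scheme.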

Abbreviations

Bi, Ri :

The black element, the red element

C :

Specific heat (J/kg K)

CB, MB, SB :

Coefficients for black elements

CR, MR, SR :

Coefficients for red elements

Ek, Wk, Nk, Sk :

Boundary elements

g :

Gravitational acceleration (m/s²)

g :

Thermal conductivity (W (m K)⁻¹)

L :

Length (m)

L5, si :

The stencil function

M, N, n :

Size of the computational matrix

Nu:

Nusselt number

Nux0 :

Local Nusselt number

Numean :

Mean Nusselt number

p :

Pressure (Pa)

Pr:

Prandtl number

Ra:

Rayleigh number

Re:

Reynolds number

S :

Source term

T :

Temperature (K)

T C :

Temperature of cold wall (K)

T H :

Temperature of hot wall (K)

T m :

Reference temperature (K)

T * :

Dimensionless temperature

t :

Time (s)

U :

Velocity vector

u :

Velocity in the x direction (m s⁻¹)

u * :

Dimensionless velocity of u

v :

Velocity in the y direction (m s⁻¹)

x, y :

Cartesian coordinate (m)

β :

Coefficient of thermal expansion (K⁻¹)

ρ :

Density (kg/m³)

μ :

Dynamic viscosity (Pa s)

ϕ :

Generic variable

i :

Index

E, W, N, S, NB :

Index of neighbor element

P :

Index of central element

C :

Cold wall

H :

Hot wall

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 51276199).

Author information

Corresponding author

Correspondence to Minghai Xu.

About this article

Cite this article

Li, R., Gong, L. & Xu, M. A heterogeneous parallel Red–Black SOR technique and the numerical study on SIMPLE. J Supercomput 76, 9585–9608 (2020). https://doi.org/10.1007/s11227-020-03221-1

  • DOI: https://doi.org/10.1007/s11227-020-03221-1
