Abstract
Complex numerical modeling problems often involve computational meshes containing hundreds of millions of cells, and some modern problems exceed a billion cells. Workstations cannot cope with this volume of data and computation. Computations at this scale require supercomputer clusters consisting of many computational nodes interconnected by a high-speed communication network. The computational mesh must then be decomposed into separate domains so that it can be processed in parallel on all nodes of the cluster. These domains are distributed among the computational nodes of the supercomputer and processed independently of each other. To perform calculations efficiently and scale them to a large number of computational nodes, efficient mesh decomposition algorithms are needed that generate many domains satisfying imposed requirements. We consider a hierarchical decomposition algorithm that chooses the optimal criterion for dividing the mesh into domains. The mesh studied is an unstructured surface mesh used to calculate the interaction of a volumetric body with the environment. Using this decomposition algorithm, supercomputer calculations are performed on the computing resources of JSCC RAS in order to measure practical scalability indicators of highly loaded applications.
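The abstract does not spell out the decomposition algorithm or its division criterion, so the following is only a minimal sketch of one common hierarchical scheme, recursive bisection of cell centroids. The function names `choose_axis` and `decompose`, and the "largest extent" criterion itself, are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def choose_axis(centroids: Sequence[Point], cells: Sequence[int]) -> int:
    """Hypothetical division criterion: the axis with the largest extent."""
    extents = []
    for axis in range(3):
        values = [centroids[c][axis] for c in cells]
        extents.append(max(values) - min(values))
    return extents.index(max(extents))


def decompose(centroids: Sequence[Point], cells: List[int], levels: int) -> List[List[int]]:
    """Recursively bisect a list of cell indices into up to 2**levels domains."""
    if levels == 0 or len(cells) < 2:
        return [cells]
    axis = choose_axis(centroids, cells)
    # Sort cells along the chosen axis and cut at the median,
    # so both halves hold (almost) the same number of cells.
    ordered = sorted(cells, key=lambda c: centroids[c][axis])
    half = len(ordered) // 2
    return (decompose(centroids, ordered[:half], levels - 1)
            + decompose(centroids, ordered[half:], levels - 1))
```

Calling `decompose(centroids, list(range(len(centroids))), levels)` returns a list of domains, each a list of cell indices. Cutting at the median keeps the domains balanced in cell count, which is one of the requirements a decomposition for parallel processing would typically impose; a production criterion would likely also account for the length of the boundary between domains.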
Funding
This work was carried out at JSCC RAS as part of the state assignment on topic 0580-2021-0016. The MVS-10P OP supercomputer (Broadwell, KNL, Skylake and Cascade Lake segments), located at JSCC RAS, was used in the research.
Additional information
(Submitted by A. M. Elizarov)
About this article
Cite this article
Shabanov, B.M., Rybakov, A.A., Shumilin, S.S. et al. Scaling of Supercomputer Calculations on Unstructured Surface Computational Meshes. Lobachevskii J Math 42, 2571–2579 (2021). https://doi.org/10.1134/S1995080221110202
DOI: https://doi.org/10.1134/S1995080221110202