-
DGEMM on integer matrix multiplication unit Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2024-03-16 Hiroyuki Ootomo, Katsuhisa Ozaki, Rio Yokota
Deep learning hardware achieves high throughput and low power consumption by reducing computing precision and specializing in matrix multiplication. For machine learning inference, fixed-point value computation is commonplace, where the input and output values and the model parameters are quantized. Thus, many processors are now equipped with fast integer matrix multiplication units (IMMU). It is of
-
Malleability techniques applications in high-performance computing Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2024-03-11 Jesus Carretero, Estela Suarez, Martin Schulz
-
Accelerating atmospheric physics parameterizations using graphics processing units Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2024-03-09 Daniel S Abdi, Isidora Jankov
As part of a project aimed at exploring the use of next-generation high-performance computing technologies for numerical weather prediction, we have ported two physics modules from the Common Community Physics Package (CCPP) to Graphics Processing Unit (GPU) and obtained accelerations of up to 10× relative to a comparable multi-core CPU. The physics parameterizations accelerated in this work are the
-
Performance of explicit and IMEX MRI multirate methods on complex reactive flow problems within modern parallel adaptive structured grid frameworks Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2024-02-26 John J. Loffeld, Andy Nonaka, Daniel R. Reynolds, David J. Gardner, Carol S. Woodward
Large-scale multiphysics simulations are computationally challenging due to the coupling of multiple processes with widely disparate time scales. The advent of exascale computing systems exacerbates these challenges since these systems enable ever-increasing size and complexity. In recent years, there has been renewed interest in developing multirate methods as a means to handle the large range of
-
High performance computing seismic redatuming by inversion with algebraic compression and multiple precisions Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2024-01-31 Yuxi Hong, Hatem Ltaief, Matteo Ravasi, David Keyes
We present a high-performance implementation of Seismic Redatuming by Inversion (SRI), which combines algebraic compression with mixed-precision (MP) computations. Seismic redatuming entails the repositioning of seismic data recorded at the surface of the Earth to a subsurface level closer to where reflections have originated. Marchenko-based redatuming is the best-in-class SRI method from a theoretical
-
IO-aware Job-Scheduling: Exploiting the Impacts of Workload Characterizations to select the Mapping Strategy Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-05-15 Emmanuel Jeannot, Guillaume Pallez, Nicolas Vidal
In high performance computing, concurrent applications share the same file system. However, the bandwidth that provides access to the storage is limited. Therefore, too many I/O operations p...
-
A study on the performance of distributed training of data-driven CFD simulations Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-05-04 Sergio Iserte, Alejandro González-Barberá, Paloma Barreda, Krzysztof Rojek
Data-driven methods for computer simulations are blooming in many scientific areas. The traditional approach to simulating physical behaviors relies on solving partial differential equations (PDEs)...
-
Myths and legends in high-performance computing Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-04-24 Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler
In this thought-provoking article, we discuss certain myths and legends that are folklore among members of the high-performance computing community. We gathered these myths from conversations at co...
-
Orchestration of materials science workflows for heterogeneous resources at large scale Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-04-14 Naweiluo Zhou, Giorgio Scorzelli, Jakob Luettgau, Rahul R Kancharla, Joshua J Kane, Robert Wheeler, Brendan P Croom, Pania Newell, Valerio Pascucci, Michela Taufer
In the era of big data, materials science workflows need to handle large-scale data distribution, storage, and computation. Any of these areas can become a performance bottleneck. We present a fram...
-
Versatile software-defined HPC and cloud clusters on Alps supercomputer for diverse workflows Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-04-11 Sadaf R Alam, Miguel Gila, Mark Klein, Maxime Martinasso, Thomas C Schulthess
Supercomputers have been driving innovations for performance and scaling benefiting several scientific applications for the past few decades. Yet their ecosystems remain virtually unchanged when it...
-
A Survey of Graph Comparison Methods with Applications to Nondeterminism in High-Performance Computing Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-04-05 Sanjukta Bhowmick, Patrick Bell, Michela Taufer
The convergence of extremely high levels of hardware concurrency and the effective overlap of computation and communication in asynchronous executions has resulted in increasing nondeterminism in H...
-
Automatizing the creation of specialized high-performance computing containers Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-03-29 Jorge Ejarque, Rosa M Badia
With Exascale computing already here, supercomputers are becoming ever larger, more complex, and more heterogeneous. While expert system administrators can install and deploy applications in the sy...
-
Combining multitask and transfer learning with deep Gaussian processes for autotuning-based performance engineering Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-03-30 Piotr Luszczek, Wissam M Sid-Lakhdar, Jack Dongarra
We combine deep Gaussian processes (DGPs) with multitask and transfer learning for the performance modeling and optimization of HPC applications. Deep Gaussian processes merge the uncertainty quant...
-
Accelerating cluster dynamics simulation of fission gas behavior in nuclear fuel on deep computing unit–based heterogeneous architecture supercomputer Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-03-14 He Bai, Changjun Hu, Yuhan Zhu, Dandan Chen, Genshen Chu, Shuai Ren
High fidelity simulation of fission gas behavior is able to help us understand and predict the performance of nuclear fuel under different irradiation conditions. Cluster dynamics (CD) is a mesosca...
-
Experiences with nested parallelism in task-parallel applications using malleable BLAS on multicore processors Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-03-10 Rafael Rodríguez-Sánchez, Adrián Castelló, Sandra Catalán, Francisco D. Igual, Enrique S. Quintana-Ortí
Malleability is defined as the ability to vary the degree of parallelism at runtime, and is regarded as a means to improve core occupation on state-of-the-art multicore processors that contain ten...
-
Large-Scale direct numerical simulations of turbulence using GPUs and modern Fortran Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-02-22 Martin Karp, Daniele Massaro, Niclas Jansson, Alistair Hart, Jacob Wahlgren, Philipp Schlatter, Stefano Markidis
We present our approach to making direct numerical simulations of turbulence with applications in sustainable shipping. We use modern Fortran and the spectral element method to leverage and scale o...
-
Mixed precision LU factorization on GPU tensor cores: reducing data movement and memory footprint Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2023-01-03 Florent Lopez, Theo Mary
Modern GPUs equipped with mixed precision tensor core units present great potential to accelerate dense linear algebra operations such as LU factorization. However, state-of-the-art mixed half/sing...
-
Semi-Lagrangian 4d, 5d, and 6d kinetic plasma simulation on large-scale GPU-equipped supercomputers Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-12-16 Lukas Einkemmer, Alexander Moriggl
Running kinetic plasma physics simulations using grid-based solvers is very demanding both in terms of memory as well as computational cost. This is primarily due to the up to six-dimensional phase...
-
Parthenon—a performance portable block-structured adaptive mesh refinement framework Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-12-13 Philipp Grete, Joshua C Dolence, Jonah M Miller, Joshua Brown, Ben Ryan, Andrew Gaspar, Forrest Glines, Sriram Swaminarayan, Jonas Lippuner, Clell J Solomon, Galen Shipman, Christoph Junghans, Daniel Holladay, James M Stone, Luke F Roberts
On the path to exascale the landscape of computer device architectures and corresponding programming models has become much more diverse. While various low-level performance portable programming mo...
-
Performance comparison of the A-grid and C-grid shallow-water models on icosahedral grids Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-11-15 Jacques Middlecoff, Yonggang G Yu, Mark W Govett
This study uses a single software framework to compare the CPU performance of Arakawa A-grid (NICAM) and C-grid (MPAS) schemes for solving the shallow-water equations on icosahedral grids. The focu...
-
Acceleration of a parallel BDDC solver by using graphics processing units on subdomains Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-11-05 Jakub Šístek, Tomáš Oberhuber
An approach to accelerating a parallel domain decomposition (DD) solver by graphics processing units (GPUs) is investigated. The solver is based on the Balancing Domain Decomposition Method by Cons...
-
Data-driven scalable pipeline using national agent-based models for real-time pandemic response and decision support Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-10-20 Parantapa Bhattacharya, Jiangzhuo Chen, Stefan Hoops, Dustin Machi, Bryan Lewis, Srinivasan Venkatramanan, Mandy L. Wilson, Brian Klahn, Aniruddha Adiga, Benjamin Hurt, Joseph Outten, Abhijin Adiga, Andrew Warren, Young Yun Baek, Przemyslaw Porebski, Achla Marathe, Dawen Xie, Samarth Swarup, Anil Vullikanti, Henning Mortveit, Stephen Eubank, Christopher L. Barrett, Madhav Marathe
This paper describes an integrated, data-driven operational pipeline based on national agent-based models to support federal and state-level pandemic planning and response. The pipeline consists of...
-
Language models for the prediction of SARS-CoV-2 inhibitors Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-10-07 Andrew E Blanchard, John Gounley, Debsindhu Bhowmik, Mayanka Chandra Shekar, Isaac Lyngaas, Shang Gao, Junqi Yin, Aristeidis Tsaris, Feiyi Wang, Jens Glaser
The COVID-19 pandemic highlights the need for computational tools to automate and accelerate drug design for novel protein targets. We leverage deep learning language models to generate and score d...
-
Digital transformation of droplet/aerosol infection risk assessment realized on “Fugaku” for the fight against COVID-19 Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-10-07 Kazuto Ando, Rahul Bale, ChungGang Li, Satoshi Matsuoka, Keiji Onishi, Makoto Tsubokura
The fastest supercomputer in 2020, Fugaku, has not only achieved digital transformation of epidemiology in allowing end-to-end, detailed quantitative modeling of COVID-19 transmissions for the firs...
-
Exploiting temporal data reuse and asynchrony in the reverse time migration Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-10-03 Long Qu, Rached Abdelkhalak, Hatem Ltaief, Issam Said, David Keyes
Reverse Time Migration (RTM) is a state-of-the-art algorithm used in seismic depth imaging in complex geological environments for the oil and gas exploration industry. It calculates high-resolution...
-
#COVIDisAirborne: AI-enabled multiscale computational microscopy of delta SARS-CoV-2 in a respiratory aerosol Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-10-02 Abigail Dommer, Lorenzo Casalino, Fiona Kearns, Mia Rosenfeld, Nicholas Wauer, Surl-Hee Ahn, John Russo, Sofia Oliveira, Clare Morris, Anthony Bogetti, Anda Trifan, Alexander Brace, Terra Sztain, Austin Clyde, Heng Ma, Chakra Chennubhotla, Hyungro Lee, Matteo Turilli, Syma Khalid, Teresa Tamayo-Mendoza, Matthew Welborn, Anders Christensen, Daniel GA Smith, Zhuoran Qiao, Sai K Sirumalla, Michael O’Connor
We seek to completely revise current models of airborne transmission of respiratory viruses by providing never-before-seen atomic-level views of the SARS-CoV-2 virus within a respiratory aerosol. O...
-
PeleC: An adaptive mesh refinement solver for compressible reacting flows Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-09-06 Marc T Henry de Frahan, Jon S Rood, Marc S Day, Hariswaran Sitaraman, Shashank Yellapantula, Bruce A Perry, Ray W Grout, Ann Almgren, Weiqun Zhang, John B Bell, Jacqueline H Chen
Reacting flow simulations for combustion applications require extensive computing capabilities. Leveraging the AMReX library, the Pele suite of combustion simulation tools targets the largest super...
-
Enabling efficient execution of a variational data assimilation application Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-08-28 John M Dennis, Allison H Baker, Brian Dobbins, Michael M Bell, Jian Sun, Youngsung Kim, Ting-Yu Cha
Remote sensing observational instruments are critical for better understanding and predicting severe weather. Observational data from such instruments, such as Doppler radar data, for example, are ...
-
Free energy perturbation–based large-scale virtual screening for effective drug discovery against COVID-19 Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-08-22 Zhe Li, Chengkun Wu, Yishui Li, Runduo Liu, Kai Lu, Ruibo Wang, Jie Liu, Chunye Gong, Canqun Yang, Xin Wang, Chang-Guo Zhan, Hai-Bin Luo
As a theoretically rigorous and accurate method, FEP-ABFE (Free Energy Perturbation-Absolute Binding Free Energy) calculations showed great potential in drug discovery, but its practical applicatio...
-
Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-08-05 Anda Trifan, Defne Gorgun, Michael Salim, Zongyi Li, Alexander Brace, Maxim Zvyagin, Heng Ma, Austin Clyde, David Clark, David J Hardy, Tom Burnley, Lei Huang, John McCalpin, Murali Emani, Hyenseung Yoo, Junqi Yin, Aristeidis Tsaris, Vishal Subbiah, Tanveer Raza, Jessica Liu, Noah Trebesch, Geoffrey Wells, Venkatesh Mysore, Thomas Gibbs, James Phillips, S Chakra Chennubhotla, Ian Foster, Rick Stevens
The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) replication transcription complex (RTC) is a multi-domain protein responsible for replicating and transcribing the viral mRNA inside...
-
Compressed basis GMRES on high-performance graphics processing units Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-08-05 José I Aliaga, Hartwig Anzt, Thomas Grützmacher, Enrique S Quintana-Ortí, Andrés E Tomás
Krylov methods provide a fast and highly parallel numerical tool for the iterative solution of many large-scale sparse linear systems. To a large extent, the performance of practical realizations o...
-
Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-07-07 Martin Kronbichler, Dmytro Sashko, Peter Munch
This work investigates a variant of the conjugate gradient (CG) method and embeds it into the context of high-order finite-element schemes with fast matrix-free operator evaluation and cheap precon...
-
An elastic framework for ensemble-based large-scale data assimilation Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-06-28 Sebastian Friedemann, Bruno Raffin
Prediction of chaotic systems relies on a floating fusion of sensor data (observations) with a numerical model to decide on a good system trajectory and to compensate non-linear feedback effects. E...
-
Corrigendum to ‘Unprecedented cloud resolution in a GPU-enabled full-physics atmospheric climate simulation on OLCF’s summit supercomputer’ Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-06-01
Matt Norman et al. (2022) Unprecedented cloud resolution in a GPU-enabled full-physics atmospheric climate simulation on OLCF’s summit supercomputer, The International Journal of High Performance Computing Applications, 36(1): 93–105. DOI: 10.1177/10943420211027539
-
A fine-grained parallelization of the immersed boundary method Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-06-03 Andrew Kassen, Varun Shankar, Aaron L Fogelson
We present new algorithms for the parallelization of Eulerian–Lagrangian interaction operations in the immersed boundary method. Our algorithms rely on two well-studied parallel primitives: key-val...
-
Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-06-03 Hiroyuki Ootomo, Rio Yokota
Tensor Core is a mixed-precision matrix–matrix multiplication unit on NVIDIA GPUs with a theoretical peak performance of more than 300 TFlop/s on Ampere architectures. Tensor Cores were developed i...
-
Accelerating physics simulations with tensor processing units: An inundation modeling example Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-06-03 R Lily Hu, Damien Pierce, Yusef Shafi, Anudhyan Boral, Vladimir Anisimov, Sella Nevo, Yi-fan Chen
Recent advancements in hardware accelerators such as Tensor Processing Units (TPUs) speed up computation time relative to Central Processing Units (CPUs) not only for machine learning but, as demon...
-
Matrix-free approaches for GPU acceleration of a high-order finite element hydrodynamics application using MFEM, Umpire, and RAJA Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-05-25 Arturo Vargas, Thomas M Stitt, Kenneth Weiss, Vladimir Z Tomov, Jean-Sylvain Camier, Tzanio Kolev, Robert N Rieben
With the introduction of advanced heterogeneous computing architectures based on GPU accelerators, large-scale production codes have had to rethink their numerical algorithms and incorporate new programming models and memory management strategies in order to run efficiently on the latest supercomputers. In this work we discuss our co-design strategy to address these challenges and achieve performance
-
Performance analysis of relaxation Runge–Kutta methods Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-05-12 Marcin Rogowski, Lisandro Dalcin, Matteo Parsani, David E Keyes
Recently, global and local relaxation Runge–Kutta methods have been developed for guaranteeing the conservation, dissipation, or other solution properties for general convex functionals whose dynamics are crucial for an ordinary differential equation solution. These novel time integration procedures have an application in a wide range of problems that require dynamics-consistent and stable numerical
-
Very fast finite element Poisson solvers on lower precision accelerator hardware: A proof of concept study for Nvidia Tesla V100 Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-05-06 Dustin Ruda, Stefan Turek, Dirk Ribbrock, Peter Zajac
Recently, accelerator hardware in the form of graphics cards including Tensor Cores, specialized for AI, has significantly gained importance in the domain of high-performance computing. For example, NVIDIA’s Tesla V100 promises a computing power of up to 125 TFLOP/s achieved by Tensor Cores, but only if half precision floating point format is used. We describe the difficulties and discrepancy between
-
Task-parallel in situ temporal compression of large-scale computational fluid dynamics data Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-04-21 Heather Pacella, Alec Dunton, Alireza Doostan, Gianluca Iaccarino
Present day computational fluid dynamics (CFD) simulations generate considerable amounts of data, sometimes on the order of TB/s. Often, a significant fraction of this data is discarded because current storage systems are unable to keep pace. To address this, data compression algorithms can be applied to data arrays containing flow quantities of interest (QoIs) to reduce the overall required storage
-
AI4IO: A suite of AI-based tools for IO-aware scheduling Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-04-03 Michael R Wyatt, II, Stephen Herbein, Todd Gamblin, Michela Taufer
Traditional workload managers do not have the capacity to consider how IO contention can increase job runtime and even cause entire resource allocations to be wasted. Whether from bursts of IO demand or parallel file systems (PFS) performance degradation, IO contention must be identified and addressed to ensure maximum performance. In this paper, we present AI4IO (AI for IO), a suite of tools using
-
An analytical performance model of generalized hierarchical scheduling Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-03-26 Stephen Herbein, Tapasya Patki, Dong H Ahn, Sebastian Mobo, Clark Hathaway, Silvina Caíno-Lores, James Corbett, David Domyancic, Thomas RW Scogland, Bronis R de Supinski, Michela Taufer
High performance computing (HPC) workflows are undergoing tumultuous changes, including an explosion in size and complexity. Despite these changes, most batch job systems still use slow, centralized schedulers. Generalized hierarchical scheduling (GHS) solves many of the challenges that face modern workflows, but GHS has not been widely adopted in HPC. A major difficulty that hinders adoption is the
-
Productively accelerating positron emission tomography image reconstruction on graphics processing units with Julia Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-03-22 Michiel Van Gendt, Tim Besard, Stefaan Vandenberghe, Bjorn De Sutter
Research in medical imaging is hampered by a lack of programming languages that support productive, flexible programming as well as high performance. In search for higher quality imaging, researchers can ideally experiment with novel algorithms using rapid-prototyping languages such as Python. However, to speed up image reconstruction, computational resources such as those of graphics processing units
-
Development of NCL equivalent serial and parallel python routines for meteorological data analysis Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-03-22 Jatin Gharat, Bipin Kumar, Leena Ragha, Amit Barve, Shaik Mohammad Jeelani, John Clyne
The NCAR Command Language (NCL) is a popular scripting language used in the geoscience community for weather data analysis and visualization. Hundreds of years of data are analyzed daily using NCL to make accurate weather predictions. However, due to its sequential nature of execution, it cannot properly utilize the parallel processing power provided by High-Performance Computing systems (HPCs). Until
-
Performance portability in a real world application: PHAST applied to Caffe Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-03-21 Pablo Antonio Martínez, Biagio Peccerillo, Sandro Bartolini, José M García, Gregorio Bernabé
This work covers the application of the PHAST Library, a hardware-agnostic programming library, to a real-world application, the Caffe framework. The original implementation of Caffe consists of two different versions of the source code: one to run on CPU platforms and another one to run on the GPU side. With PHAST, we aim to develop a single-source code implementation capable of running efficiently
-
Efficient high-precision integer multiplication on the GPU Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-03-20 Adrian P Dieguez, Margarita Amor, Ramón Doallo, Akira Nukada, Satoshi Matsuoka
The multiplication of large integers, which has many applications in computer science, is an operation that can be expressed as a polynomial multiplication followed by a carry normalization. This work develops two approaches for efficient polynomial multiplication: one approach is based on tiling the classical convolution algorithm, but taking advantage of new CUDA architectures, a novel approach
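The decomposition named in this abstract (polynomial multiplication followed by carry normalization) can be illustrated with a minimal CPU-side sketch. This is not the authors' GPU implementation: the function name `multiply_via_polynomials`, the `base` parameter, and the untiled schoolbook convolution are all assumptions made purely for illustration.

```python
def multiply_via_polynomials(a: int, b: int, base: int = 10) -> int:
    """Multiply two non-negative integers in two phases (illustrative only)."""

    def to_digits(n: int) -> list[int]:
        # Little-endian digits of n in the given base; 0 becomes [0].
        digits = []
        while n:
            n, r = divmod(n, base)
            digits.append(r)
        return digits or [0]

    x, y = to_digits(a), to_digits(b)

    # Phase 1: polynomial multiplication (classical convolution).
    # Coefficients may exceed base - 1; no carries are handled yet.
    coeffs = [0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            coeffs[i + j] += xi * yj

    # Phase 2: carry normalization, pushing overflow to higher digits.
    digits = []
    carry = 0
    for c in coeffs:
        carry, d = divmod(c + carry, base)
        digits.append(d)
    while carry:
        carry, d = divmod(carry, base)
        digits.append(d)

    # Reassemble the normalized digits into a Python integer.
    result = 0
    for d in reversed(digits):
        result = result * base + d
    return result
```

Separating the two phases is what makes the operation attractive for parallel hardware: every coefficient of the convolution can be computed independently, and only the normalization step carries a sequential dependency.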
-
Enhancing scalability of a matrix-free eigensolver for studying many-body localization Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-03-19 Roel Van Beeumen, Khaled Z. Ibrahim, Gregory D. Kahanamoku–Meyer, Norman Y. Yao, Chao Yang
We propose several techniques to enhance the parallel scalability of a matrix-free eigensolver designed for studying many-body localization (MBL) of quantum spin chain models with nearest-neighbor interactions and on-site disorder. This type of problem is computationally challenging because the dimension of the associated Hamiltonian matrix grows exponentially with respect to the number of spins L
-
A massively parallel time-domain coupled electrodynamics–micromagnetics solver Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-01-15 Zhi Yao, Revathi Jambunathan, Yadong Zeng, Andrew Nonaka
We present a high-performance coupled electrodynamics–micromagnetics solver for full physical modeling of signals in microelectronic circuitry. The overall strategy couples a finite-difference time-domain approach for Maxwell’s equations to a magnetization model described by the Landau–Lifshitz–Gilbert equation. The algorithm is implemented in the Exascale Computing Project software framework, AMReX
-
ExaAM: Metal additive manufacturing simulation at the fidelity of the microstructure Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-01-10 John A Turner, James Belak, Nathan Barton, Matthew Bement, Neil Carlson, Robert Carson, Stephen DeWitt, Jean-Luc Fattebert, Neil Hodge, Zechariah Jibben, Wayne King, Lyle Levine, Christopher Newman, Alex Plotkowski, Balasubramaniam Radhakrishnan, Samuel Temple Reeve, Matthew Rolchigo, Adrian Sabau, Stuart Slattery, Benjamin Stump
Additive manufacturing (AM), or 3D printing, of metals is transforming the fabrication of components, in part by dramatically expanding the design space, allowing optimization of shape and topology. However, although the physical processes involved in AM are similar to those of welding, a field with decades of experimental, modeling, simulation, and characterization experience, qualification of AM
-
Large-scale ab initio simulation of light–matter interaction at the atomic scale in Fugaku Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2022-01-02 Yuta Hirokawa, Atsushi Yamada, Shunsuke Yamada, Masashi Noda, Mitsuharu Uemoto, Taisuke Boku, Kazuhiro Yabana
In the field of optical science, it is becoming increasingly important to observe and manipulate matter at the atomic scale using ultrashort pulsed light. For the first time, we have performed the ab initio simulation solving the Maxwell equation for light electromagnetic fields, the time-dependent Kohn-Sham equation for electrons, and the Newton equation for ions in extended systems. In the simulation
-
Development of a hardware-accelerated simulation kernel for ultra-high vacuum with Nvidia RTX GPUs Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-12-11 Pascal R Bähr, Bruno Lang, Peer Ueberholz, Marton Ady, Roberto Kersevan
Molflow+ is a Monte Carlo (MC) simulation software for ultra-high vacuum, mainly used to simulate pressure in particle accelerators. In this article, we present and discuss the design choices arising in a new implementation of its ray-tracing–based simulation unit for Nvidia RTX Graphics Processing Units (GPUs). The GPU simulation kernel was designed with Nvidia’s OptiX 7 API to make use of modern
-
Resiliency in numerical algorithm design for extreme scale simulations Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-12-10 Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M Ciorba, Nathan DeBardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N Gansterer, Luc Giraud, Dominik Göddeke, Marco Heisig, Fabienne Jézéquel, Nils Kohl, Xiaoye Sherry Li, Romain Lion, Miriam Mehl, Paul Mycek, Michael
This work is based on the seminar titled ‘Resiliency in Numerical Algorithm Design for Extreme Scale Simulations’ held March 1–6, 2020, at Schloss Dagstuhl, that was attended by all the authors. Advanced supercomputing is characterized by very high computation speeds at the cost of involving an enormous amount of resources and costs. A typical large-scale computation running for 48 h on a system consuming
-
Co-design in the Exascale Computing Project Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-11-07 Timothy C Germann
We provide an overview of the six co-design centers within the U.S. Department of Energy’s Exascale Computing Project, each of which is described in more detail in a separate paper in this special issue. We also give a perspective on the evolution of computational co-design.
-
SAM++: Porting the E3SM-MMF cloud resolving model using a C++ portability library Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-10-10 Isaac Lyngaas, Matt Norman, Youngsung Kim
In this work, we demonstrate the process for porting the cloud resolving model (CRM) used in the Energy Exascale Earth System Model Multi-Scale Modeling Framework (E3SM-MMF) from its original Fortran code base to C++ code using a portability library. This porting process is performed using the Yet Another Kernel Library (YAKL), a simplified C++ portability library that specializes in Fortran porting
-
Highly scalable numerical simulation of coupled reaction–diffusion systems with moving interfaces Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-10-10 Mojtaba Barzegari, Liesbet Geris
A combination of reaction–diffusion models with moving-boundary problems yields a system in which the diffusion (spreading and penetration) and reaction (transformation) evolve the system’s state and geometry over time. These systems can be used in a wide range of engineering applications. In this study, as an example of such a system, the degradation of metallic materials is investigated. A mathematical
-
EXAGRAPH: Graph and combinatorial methods for enabling exascale applications Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-09-30 Seher Acer, Ariful Azad, Erik G Boman, Aydın Buluç, Karen D. Devine, SM Ferdous, Nitin Gawande, Sayan Ghosh, Mahantesh Halappanavar, Ananth Kalyanaraman, Arif Khan, Marco Minutoli, Alex Pothen, Sivasankaran Rajamanickam, Oguz Selvitopi, Nathan R Tallent, Antonino Tumeo
Combinatorial algorithms in general and graph algorithms in particular play a critical enabling role in numerous scientific applications. However, the irregular memory access nature of these algorithms makes them one of the hardest algorithmic kernels to implement on parallel systems. With tens of billions of hardware threads and deep memory hierarchies, the exascale computing systems in particular
-
Co-design Center for Exascale Machine Learning Technologies (ExaLearn) Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-09-27 Francis J Alexander, James Ang, Jenna A Bilbrey, Jan Balewski, Tiernan Casey, Ryan Chard, Jong Choi, Sutanay Choudhury, Bert Debusschere, Anthony M DeGennaro, Nikoli Dryden, J Austin Ellis, Ian Foster, Cristina Garcia Cardona, Sayan Ghosh, Peter Harrington, Yunzhi Huang, Shantenu Jha, Travis Johnston, Ai Kagawa, Ramakrishnan Kannan, Neeraj Kumar, Zhengchun Liu, Naoya Maruyama, Satoshi Matsuoka, Erin
Rapid growth in data, computational methods, and computing power is driving a remarkable revolution in what variously is termed machine learning (ML), statistical learning, computational learning, and artificial intelligence. In addition to highly visible successes in machine-based natural language translation, playing the game Go, and self-driving cars, these new technologies also have profound implications
-
A population data-driven workflow for COVID-19 modeling and learning Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-09-10 Jonathan Ozik, Justin M Wozniak, Nicholson Collier, Charles M Macal, Mickaël Binois
CityCOVID is a detailed agent-based model that represents the behaviors and social interactions of 2.7 million residents of Chicago as they move between and colocate in 1.2 million distinct places, including households, schools, workplaces, and hospitals, as determined by individual hourly activity schedules and dynamic behaviors such as isolating because of symptom onset. Disease progression dynamics
-
Special Issue Introduction: The Gordon Bell Special Prize for HPC-Based COVID-19 Research Finalists Int. J. High Perform. Comput. Appl. (IF 3.1) Pub Date : 2021-09-01 Bronis R. de Supinski