Direct numerical simulations of turbulent reacting flows with shock waves and stiff chemistry using many-core/GPU acceleration

doi:10.1016/j.compfluid.2020.104787

Computers & Fluids

Volume 215, 30 January 2021, 104787

https://doi.org/10.1016/j.compfluid.2020.104787 Get rights and content

Highlights

•
Performance portability of a new reacting flow DNS solver, KARFS, is demonstrated.
•
Effects of different metrics on elucidating performance is highlighted.
•
DNS of detonation with stiff chemistry is studied under non-trivial conditions.
•
Potential of using WENO schemes in reacting flow DNS without shocks is presented.

Abstract

Compressible reacting flows may display sharp spatial variation related to shocks, contact discontinuities or reactive zones embedded within relatively smooth regions. The presence of such phenomena emphasizes the relevance of shock-capturing schemes such as the weighted essentially non-oscillatory (WENO) scheme as an essential ingredient of the numerical solver. However, these schemes are complex and have more computational cost than the simple high-order compact or non-compact schemes. In this paper, we present the implementation of a seventh-order, minimally-dissipative mapped WENO (WENO7M) scheme in a newly developed direct numerical simulation (DNS) code called KAUST Adaptive Reactive Flows Solver (KARFS). In order to make efficient use of the computer resources and reduce the solution time, without compromising the resolution requirement, the WENO routines are accelerated via graphics processing unit (GPU) computation. The performance characteristics and scalability of the code are studied using different grid sizes and block decomposition. The performance portability of KARFS is demonstrated on a variety of architectures including NVIDIA Tesla P100 GPUs and NVIDIA Kepler K20X GPUs. In addition, the capability and potential of the newly implemented WENO7M scheme in KARFS to perform DNS of compressible flows is also demonstrated with model problems involving shocks, isotropic turbulence, detonations and flame propagation into a stratified mixture with complex chemical kinetics.

Introduction

The pursuit of improved efficiency and lower emissions in internal combustion engines has been an ongoing quest for several decades and continues to attract significant attention due to its economic and environmental impact. Recent emphasis on reducing the harmful nitrogen oxide and greenhouse gases emissions has renewed the focus on developing better internal combustion (IC) technologies that can derive the full economic potential of fossil fuels for transportation applications while lowering emissions. Several advanced combustion concepts have emerged which hold the promise of simultaneously improving the efficiency and lowering the emissions of IC engines, while also allowing the utilization of renewable fuels [1], [2], [3]. The combustion concepts that are being pursued are quite different from the traditional spark ignition (SI) or compression ignition (CI) based engines. Ignition and combustion are expected to occur at conditions that are in the fringes of the traditional operating regimes of IC engines with limited prior understanding and experience. Furthermore, mixed modes of combustion have been observed to occur in which propagation of reaction fronts and uniform volumetric combustion exist simultaneously ([4], [5], [6], [7], [8], [9], [10] and references therein). There is currently only a limited fundamental understanding of the nature of combustion in an auto-ignitive mixture that is conducive to both spontaneous ignition and diffusion driven flame propagation. Several previous experimental and numerical studies have focused on characterizing the nature of combustion and delineating it into multiple regimes based on the propagation speed of the autoignition front and also based on the impact of turbulence and stratification on such a front [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21]. Fundamental numerical simulations that can capture the nature and characteristics of auto-ignition in stratified reactive mixtures are key to improving and applying the appropriate combustion models in device-scale simulations. As such, modeling and simulation will continue to play a very important role in improving the fundamental understanding of the combustion characteristics at such conditions, and in advancing the combustion technologies for practical applications.

Direct numerical simulation (DNS) of turbulent reacting flows is a first principle approach for turbulent combustion simulations that does not require closure models for turbulence physics and therefore free of underlying statistical and numerical assumptions [22]. DNS of reacting flows in canonical flow configurations such as triply periodic cube domains, quasi-one dimensional planar stabilized and fully three-dimensional spatially developing jet configurations have been used to study turbulence-chemistry interaction in reactive mixtures [4], [5], [23]. Since DNS resolves all of the relevant flow scales up to the smallest dissipative scales, it becomes prohibitively expensive to simulate large scale flow motion up to device scales. However, smaller canonical geometries have been within the reach of DNS for several years now. Newer developments in high performance computing (HPC) have allowed an increase in the range of length scales simulated, providing more realism in the turbulence physics, along with simultaneously increasing the chemical complexity, allowing the simulation of fuels and intermediates that are better surrogates of transportation fuels.

One of the primary objectives of DNS is to capture the turbulent flow physics as accurately as possible without introducing numerical distortion that would cause the turbulence energy cascade to be incorrect. Low dissipation by employing higher order accuracy have become essential for the fidelity of simulations. Higher order centered finite difference schemes [24] along with higher order accurate Runge–Kutta time integration schemes [25] have been very successful in enabling DNS with higher solution accuracy while also allowing efficient and affordable computations of large problems. Since the central difference scheme lacks the numerical dissipation needed to ensure stability, it is also customary to use a filter operator, such as the tenth order accurate explicit filter by Kennedy and Carpenter [24]. Such a combination of higher order accurate central difference schemes along with explicit filtering has proven to be well suited and sufficient for DNS of low speed turbulent combustion when the combustion fronts are sufficiently resolved on the numerical grid.

Despite their advantages, centered finite difference schemes are not suitable for performing DNS of supersonic combustion. Under normal conditions, the spatial profile of a propagating reaction front is determined by the transport rate of heat and radicals between the reaction zone and the upstream mixture. The thickness of a typical transport-driven flame front is determined by the molecular or turbulent diffusivity, in the case of laminar and turbulent flames, respectively. However, a spontaneous propagation front under auto-ignitive conditions is not limited by the transport rates. The thickness of a spontaneous propagation front is determined primarily by the gradients in reactivity of the upstream mixture and the effects of transport rate are secondary. When a spontaneously propagating front encounters an adverse gradient in reactivity, it can turn into a sharp contact discontinuity between the unreacted and reacted mixtures, with steep gradients in density, temperature and composition. Such a contact discontinuity is distinct from a detonation, since there is no large pressure jump between the unburned and burned mixtures. The contact discontinuity can only be resolved by the central difference schemes seldomly and often requires the use of a numerical scheme capable of capturing discontinuities. Similarly, a detonation may occur when a spontaneously propagating front acheives resonance with the accompanying pressurc wave, thereby rendering the standard centered finite difference schemes ineffective in capturing the underlying physics.

Weighted essentially non-oscillatory (WENO) schemes [26] are a class of numerical methods that have been shown to be capable of capturing discontinuities in compressible flows, while also being suitable for DNS. WENO schemes obtain a numerical approximation to the advective fluxes in the governing equation using a weighted sum of fluxes computed using multiple candidate stencils. The weights are determined by adapting to the presence of discontinuities such that the solution tends to approach high accuracy and low dissipation in smooth regions of the flow. WENO schemes are considerably more expensive than regular central difference schemes due to the need to evaluate the same flux functions from multiple candidate stencils [27]. However, the numerical method is suitable for implementation in finite difference codes using stencil operations very similar to the central difference schemes. Also, the usual advantages of DNS using central difference schemes such as parallelism and scalability carries over to the WENO schemes as well. Apart from DNS, these advantages of WENO schemes have also been demonstrated in implicit large eddy simulations (iLES) [28], [29], [30].

Success in the general application of computational fluid dynamics (CFD) and affiliated techniques such as DNS is largely dependent on achieving an optimal balance between multi-scale fidelity, multi-physics detail and the overall computational cost. In this regard, recent advances in computational architectures offer both potential and challenges. Dramatic speed-ups can be potentially achieved across platforms using state-of-the-art multi-core central processing units (CPUs) and graphics processing units (GPUs). The challenge, however, is actually attaining this potential through development of new parallel programming models that can be used to efficiently port various algorithms and the related code kernels to these new heterogeneous architectures. A major challenge in utilizing heterogeneous resources is the diversity of devices on different machines, which provide widely varying performance characteristics. A program or algorithm optimized for one architecture may not run as well on the next generation of processors or on a device from a different vendor. A program or algorithm optimized for GPU execution is often very different from the one optimized for CPU execution. The relative performance of CPUs and GPUs also varies between machines. On one machine, a specific portion of a computation may run best on the CPU, while on another machine it may run best on the GPU. In some cases, it is best to balance the workload between the CPU and the GPU; in other cases, it may be best to execute an algorithm on a device where it runs more slowly but closer to where its output is needed, in order to avoid expensive data transfer operations. It is not uncommon for large application codes to have several different implementations, with each one optimized for a different architecture. This software development approach leads to a code maintainability issue: every new change to the code needs to be implemented in all versions of the code. Performance portability of a single code base, therefore, has become a critical issue: the parallel code needs to execute correctly and remain performant on machines with different architectures, operating systems, and software libraries.

Several groups have published their efforts for accelerating the flow solvers used for multi-component reacting flow calculations using GPUs. The most common approach is to select the most time consuming and compute-intensive kernels and offload them to the GPUs for accelerated solution [31]. In the case of combustion solvers, the chemical reaction kinetics remains the predominant kernel which is routinely offloaded to the GPUs. Spafford and co-workers [32] were the first to use this approach to offload the chemical kinetics evaluation to a GPU using the CUDA programming model and thereby accelerate a DNS solver for turbulent combustion. Their approach was to use grid-level parallelism for acceleration by computing the reaction kinetics across the large number of grid points in parallel on the GPU. In contrast, Shi et al. [33], utilized the parallelism available in the reaction network itself to simultaneously calculate all the reaction rates for a single kinetic system. They observed a considerable speedup for large chemical reaction mechanisms ( $> 1000$ species), but could not obtain any acceleration for smaller mechanisms ( $< 100$ species). The performance of implicit integration schemes for stiff chemical kinetics against the speedups obtained for explicit integration has also been measured by Stone et al. [34]. More recently, Niemeyer et al. [35] used fully explicit and stabilized explicit methods for accelerating chemical kinetics with low to moderate levels of stiffness. In these studies, it was found that the potential for acceleration of chemical kinetics through GPUs is large when explicit integration is feasible. But in the presence of stiffness, the performance on GPUs was highly susceptible to thread divergence due to varying levels of stiffness across the multiple states.

We introduced a new DNS solver - KARFS (KAUST Adaptive Reacting Flows Solver) for combustion calculations in a previous research article [36], showcasing its application in multidimensional turbulent reacting flow simulations. KARFS is a modern DNS code developed in C++ using modern parallel programming patterns, distributed memory parallelism through message passing interface (MPI) and portable on-node parallelism with the Kokkos C++ programming model [37], [38]. KARFS has performance-portable capabilities for multi- and many-core heterogeneous platforms, which is instrumental to meet emerging demands of exascale computing machines. The employed Kokkos framework allows KARFS to straighforwardly utilize the same source code on CPUs and GPUs while maintaining performance comparable to the code that is optimally tuned to either processing unit. We have taken the approach to use standard C++ programming practices for both on- and off-node parallelism. This has been the main motivation for choosing the Kokkos programming model, in which all kernel launches on a target device and all memory copies between memory spaces are asynchronous by default. With further development of Kokkos to allow multiple stream/task parallelism, it will be possible to schedule multiple streams of tasks to hide latencies. We believe that this will be the most portable programming model for future HPC application. Current choices for asynchronous execution or data management requires vendor specific extensions to the standard programming languages. Furthermore, they require compiler extensions through Pragmas that are not widely supported by all open source compilers. Other available runtime systems cannot currently inter-operate with Kokkos-type programming abstractions. We do not find any of these approaches attractive and hence will track the task parallelism model being developed in Kokkos.

In this paper, the numerical methods available in KARFS have been extended to include a seventh-order mapped WENO (WENO7M) scheme [39] for capturing discontinuities and a stiff ordinary differential equation (ODE) solver for dealing with stiff and complex chemistry. The computational loops and kernels for the WENO scheme are written using Kokkos parallel patterns so as to provide performance portability. Similarly, the linear algebra computations related to the stiff-ODE integrator are accelerated by making use of the MAGMA library [40], which was designed for hybrid computing architectures. In a recent study [41], it was demonstrated that the standard seventh-order WENO scheme developed by Jiang and Shu [42] is less dissipative compared to the corresponding standard fifth-order WENO scheme. Moreover, the standard seventh-order WENO scheme cannot exactly guarantee seventh-order accuracy since the nonlinear weights do not attain optimal values near the critical points. Henrick et al. [39] suggested a mapping technique such that the formal order of accuracy is efficiently recovered near critical points. Subsequent simulations of pulsating one-dimensional detonations demonstrated true fifth order accuracy [43]. These are the main reasons for specifically choosing the WENO7M scheme. Note, however, that it is not the only scheme suitable for reacting flow DNS. Alternatives could also be pure WENO schemes with enhanced smoothness measures, such as the WENO-Z schemes [44] or schemes with symmetric candidate stencils and bandwidth optimized weights, such as the WENO-SYMBO schemes [45].

The main scope of the paper is threefold: (1) to present the implementation and validation of the newly implemented WENO7M scheme and the stiff ODE-solver in KARFS, (2) to investigate the performance characteristics of KARFS by adopting various types of parallelism strategies and (3) to demonstrate execution of KARFS on a variety of architectures including NVIDIA Tesla P100 GPUs and NVIDIA Kepler K20X GPUs without making any changes to the source code.

Section snippets

Governing equations

The DNS code KARFS [36] solves the fully compressible continuity, momentum, total energy, and species equations in conservative form with detailed chemistry for a mixture of ideal gases on structured, Cartesian grids $\frac{\partial ρ}{\partial t} = - \frac{\partial (ρ u_{i})}{\partial x_{i}},$ $\frac{\partial (ρ u_{i})}{\partial t} = - \frac{\partial (ρ u_{i} u_{j})}{\partial x_{j}} - \frac{\partial P}{\partial x_{i}} + \frac{\partial τ_{i j}}{\partial x_{j}},$ $\frac{\partial (ρ e_{t})}{\partial t} = - \frac{\partial (ρ e_{t} u_{j})}{\partial x_{j}} - \frac{\partial (P u_{j})}{\partial x_{j}} + \frac{\partial (τ_{i j} u_{i})}{\partial x_{j}} - \frac{\partial q_{j}}{\partial x_{j}},$ $\frac{\partial (ρ Y_{k})}{\partial t} = - \frac{\partial (ρ Y_{k} u_{j})}{\partial x_{j}} - \frac{\partial J_{k, j}}{\partial x_{j}} + {\dot{ω}}_{k},$ where the Einstein summation convention is implied, $ρ$ is the density, $u_{i}$ is the Cartesian velocity component in the $i th$ coordinate direction ( $i = 1$

Numerical method

In its original formulation, KARFS [36] solves the compressible Navier–Stokes, species and energy equations fully explicitly employing a fourth-order, six-stage explicit Runge–Kutta operator for time integration [25]. To add robustness and performance gains when dealing with stiff and complex chemistry, an efficient stiff-ODE solver and a second-order operator splitting algorithm have also been implemented. Depending upon the nature of the problem, spatial discretization can be done using

Sod’s shock tube problem

The popular Sod shock tube problem [52] constitutes probably one of the most standard numerical benchmarks designed for compressible flow solvers. This test consists of a fluid, initially at rest, in which a virtual membrane located at the center of the domain separates two distinct sections: the one on the left at a higher density and pressure, and the one on the right at a lower density and pressure. The membrane is removed at $t = 0$ and a shock wave develops propagating toward the right,

Performance characteristics

In this section, we study the performance characteristics of KARFS and its scalability using different block sizes. The primary system used in this work is the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). Titan is a hybrid architecture Cray XK7 system, consisting of 18,688 compute nodes that are interconnected by Cray’s Gemini network. Each node is composed of a 16-core AMD Opteron processor and an NVIDIA Tesla K20X graphics processing unit (GPU) as an accelerator.

Conclusions

Using the MPI + X programming model, the implementation and validation of a seventh-order, minimally dissipative, mapped WENO (WENO7M) scheme and an efficient operator-splitting method in KARFS are demonstrated. The MPI + X programming model relies on Kokkos for “X” for performance portability to multi-core, many-core and GPUs. The capability and potential of the newly implemented WENO7M scheme in KARFS to perform DNS of compressible flows is also demonstrated with model problems involving

CRediT authorship contribution statement

Swapnil Desai: Conceptualization, Methodology, Software, Writing - original draft, Data curation, Investigation, Formal analysis, Validation, Visualization. Yu Jeong Kim: Data curation, Investigation, Validation. Wonsik Song: Data curation, Investigation, Validation. Minh Bau Luong: . Francisco E. Hernández Pérez: Data curation, Investigation, Validation, Visualization, Formal analysis. Ramanan Sankaran: Writing - review & editing, Supervision, Resources. Hong G. Im: Writing - review & editing,

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was sponsored by competitive research funding from King Abdullah University of Science and Technology (KAUST). This research used resources of the computer clusters at KAUST Supercomputing Laboratory (KSL), the Oak Ridge Leadership Computing Facility and the Compute Data and Environment for Science (CADES) at ORNL, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

References (71)

M. Yao et al.
Progress and recent trends in homogeneous charge compression ignition (HCCI) engines
Prog Energy Combust Sci
(2009)
J.E. Dec
Advanced compression-ignition engines - understanding the in-cylinder processes
Proc Combust Inst
(2009)
A.K. Agarwal et al.
Evolution, challenges and path forward for low temperature combustion engines
Prog Energy Combust Sci
(2017)
A. Bhagatwala et al.
Numerical and experimental investigation of turbulent DME jet flames
Proc Combust Inst
(2015)
M.B. Luong et al.
Direct numerical simulations of the ignition of lean primary reference fuel/air mixtures with temperature inhomogeneities
Combust Flame
(2013)
M.B. Luong et al.
Direct numerical simulations of ignition of a lean $n$ -heptane/air mixture with temperature and composition inhomogeneities relevant to HCCI and SCCI combustion
Combust Flame
(2015)
M.B. Luong et al.
Ignition of a lean PRF/air mixture under RCCI/SCCI conditions: a comparative DNS study
Proc Combust Inst
(2017)
M.B. Luong et al.
Ignition of a lean PRF/air mixture under RCCI/SCCI conditions: chemical aspects
Proc Combust Inst
(2017)
M.B. Luong et al.
On the effect of injection timing on the ignition of lean PRF/air/EGR mixtures under direct dual fuel stratification conditions
Combust Flame
(2017)
D. Assanis et al.
An experimental study of flame and autoignition interactions of iso-octane and air mixtures
Combust Flame
(2015)

C. Strozzi et al.

Experimental analysis of propagation regimes during the autoignition of a fully premixed methane - air mixture in the presence of temperature inhomogeneities

Combust Flame

(2012)

X. Ma et al.

An optical study of in-cylinder CH $_{2}$ O and OH chemiluminescence in flame-induced reaction front propagation using high speed imaging

Fuel

(2014)

G. Bansal et al.

Autoignition and front propagation in low temperature combustion engine environments

Combust Flame

(2011)

S. Gupta et al.

Classification of ignition regimes in HCCI combustion using computational singular perturbation

Proc Combust Inst

(2011)

S. Gupta et al.

Analysis of n-heptane auto-ignition characteristics using computational singular perturbation

Proc Combust Inst

(2013)

P. Pal et al.

Computational characterization of ignition regimes in a syngas/air mixture with temperature fluctuations

Proc Combust Inst

(2017)

M.B. Luong et al.

Prediction of ignition modes of NTC-fuel/air mixtures with temperature and concentration fluctuations

Combust Flame

(2020)

G.H. Yu et al.

Ignition characteristics of a temporally evolving $n$ -heptane jet in an $i s o$ -octane/air stream under RCCI combustion-relevant conditions

Combust Flame

(2019)

S. Desai et al.

Auto-ignitive deflagration speed of methane (CH $_{4}$ ) blended dimethyl-ether (DME)/air mixtures at stratified conditions

Combust Flame

(2020)

R. Sankaran et al.

The effects of non-uniform temperature distribution on the ignition of a lean homogeneous hydrogen air mixture

Proc Combust Inst

(2005)

C.A. Kennedy et al.

Several new numerical methods for compressible shear-layer simulations

Appl Numer Math

(1994)

C.A. Kennedy et al.

Low-storage, explicit Runge–Kutta schemes for the compressible Navier–Stokes equations

Appl Numer Math

(2000)

X.-D. Liu et al.

Weighted essentially non-oscillatory schemes

J Comput Phys

(1994)

K. Ritos et al.

Physical insight into the accuracy of finely-resolved iLES in turbulent boundary layers

Comput Fluids

(2018)

K. Ritos et al.

Performance of high-order implicit large eddy simulations

Comput Fluids

(2018)

Y. Shi et al.

Redesigning combustion modeling algorithms for the graphics processing unit (GPU): chemical kinetic rate evaluation and ordinary differential equation integration

Combust Flame

(2011)

K.E. Niemeyer et al.

Accelerating moderately stiff chemical kinetics in reactive-flow simulations using GPUs

J Comput Phys

(2014)

F.E. Hernández Pérez et al.

Direct numerical simulations of reacting flows with detailed chemistry using many-core/GPU acceleration

Comput Fluids

(2018)

A.K. Henrick et al.

Mapped weighted essentially non-oscillatory schemes: achieving optimal order near critical points

J Comput Phys

(2005)

S. Tomov et al.

Towards dense linear algebra for hybrid GPU accelerated manycore systems

Parallel Comput

(2010)

S. Zhao et al.

Comparison of improved finite-difference WENO schemes for the implicit large eddy simulation of turbulent non-reacting and reacting high-speed shear flows

Comput Fluids

(2014)

G.-S. Jiang et al.

Efficient implementation of weighted ENO schemes

J Comput Phys

(1996)

A.K. Henrick et al.

Simulations of pulsating one-dimensional detonations with true fifth order accuracy

J Comput Phys

(2006)

R. Borges et al.

An improved weighted essentially non-oscillatory scheme for hyperbolic conservation laws

J Comput Phys

(2008)

M.P. Martín et al.

A bandwidth-optimized WENO scheme for the effective direct numerical simulation of compressible turbulence

J Comput Phys

(2006)

Cited by (35)

Heterogeneous CPU-GPU parallelization for modeling supersonic reacting flows with detailed chemical kinetics
2024, Computer Physics Communications
Accurate simulations of complex combustion phenomena associated with supersonic reacting flows require the use of detailed chemical kinetic mechanisms. However, detailed chemical mechanisms may consist of a large number of species and reactions resulting in extremely high computation cost. In order to accelerate the simulations of supersonic reacting flows with detailed chemical mechanisms, a heterogeneous CPU-GPU parallel algorithm and its portable software implementation are presented. The parallel algorithm was broken down into two parts: the CPU handled the computation of fluid dynamics while the GPU evaluated the chemical source terms and gas physical properties. The use of overlapping computations of chemical source terms on GPU and calculations of viscous flux on CPU is also presented. A study of performance tests was conducted. The performance results show that evaluating chemical source terms and gas physical properties on 2 GPUs are about 157.8× and 78.5× faster than running on the 16-core CPU when using the most complex mechanism on a grid of 3.3 million cells, respectively, resulting in an excellent speedup of the whole iteration up to 47. The significant performance improvement provided by the parallel algorithm can provide a significant perspective for designing heterogeneous CPU-GPU algorithms for applications in simulating supersonic reacting flows with detailed chemical kinetics.
Program Title: OpenHurricane
CPC Library link to program files: https://doi.org/10.17632/m9pphg9cjj.1
Licensing provisions: GPLv3
Programming language: C++, CUDA
Nature of problem: The computation cost in evaluating chemical source terms and theirs diagonal Jacobian matrices, and in calculating gas physical properties dominates the simulations of supersonic reacting flows with detailed chemical kinetic models. And the parallelization of these computations can yield great benefits from GPU acceleration.
Solution method: The parallel implementation of evaluating the chemical source terms and theirs diagonal Jacobian matrices, and calculating gas physical properties were conducted on GPU to speed up the computation.
A data-driven reduced-order model for stiff chemical kinetics using dynamics-informed training
2024, Energy and AI
A data-based reduced-order model (ROM) is developed to accelerate the time integration of stiff chemically reacting systems by effectively removing the stiffness arising from a wide spectrum of chemical time scales. Specifically, the objective of this work is to develop a ROM that acts as a non-stiff surrogate model for the time evolution of the thermochemical state vector (temperature and species mass fractions) during an otherwise highly stiff and nonlinear ignition process. The model follows an encode-forecast-decode strategy that combines a nonlinear autoencoder (AE) for dimensionality reduction (encode and decode steps) with a neural ordinary differential equation (NODE) for modeling the dynamical system in the AE-provided latent space (forecasting step). By means of detailed timescale analysis by leveraging the dynamical system Jacobians, this work shows how data-based projection operators provided by autoencoders can inherently construct the latent spaces by removing unnecessary fast timescales, even more effectively than physics-based counterparts based on an eigenvalue analysis. A key finding is that the most significant degree of stiffness reduction is achieved through an end-to-end training strategy, where both AE and neural ODE parameters are optimized simultaneously, allowing the discovered latent space to be dynamics-informed. In addition to end-to-end training, this work highlights the vital contribution of AE nonlinearity in the stiffness reduction task. For the prediction of homogeneous ignition phenomena for H $_{2}$ -air and C $_{2}$ H $_{4}$ -air mixtures, the proposed ROM achieves several orders-of-magnitude increase in the integration time step size when compared to (a) a baseline CVODE solver for the full-chemical system, (b) statistical technique – principal component analysis (PCA), and (c) computational singular perturbation (CSP), a vetted physics-based stiffness-reducing modeling framework.
A co-kurtosis PCA based dimensionality reduction with nonlinear reconstruction using neural networks
2024, Combustion and Flame
For turbulent reacting flow systems, identification of low-dimensional representations of the thermo-chemical state space is vitally important, primarily to significantly reduce the computational cost of device-scale simulations. Principal component analysis (PCA), and its variants, are a widely employed class of methods. Recently, an alternative technique that focuses on higher-order statistical interactions, co-kurtosis PCA (CoK-PCA), has been shown to effectively provide a low-dimensional representation by capturing the stiff chemical dynamics associated with spatiotemporally localized reaction zones. While its effectiveness has only been demonstrated based on a priori analyses with linear reconstruction, in this work, we employ nonlinear techniques to reconstruct the full thermo-chemical state and evaluate the efficacy of CoK-PCA compared to PCA. Specifically, we combine a CoK-PCA-/PCA-based dimensionality reduction (encoding) with an artificial neural network (ANN) based reconstruction (decoding) and examine, a priori, the reconstruction errors of the thermo-chemical state. In addition, we evaluate the errors in species production rates and heat release rates, which are nonlinear functions of the reconstructed state, as a measure of the overall accuracy of the dimensionality reduction technique. We employ four datasets to assess CoK-PCA/PCA coupled with ANN-based reconstruction: zero-dimensional (homogeneous) reactor for autoignition of an ethylene/air mixture that has conventional single-stage ignition kinetics, a dimethyl ether (DME)/air mixture which has two-stage (low and high temperature) ignition kinetics, a one-dimensional freely propagating premixed ethylene/air laminar flame, and a two-dimensional dataset representing turbulent autoignition of ethanol in a homogeneous charge compression ignition (HCCI) engine. Results from the analyses demonstrate the robustness of the CoK-PCA based low-dimensional manifold with ANN reconstruction in accurately capturing the data, specifically from the reaction zones.
Heat release characteristics of ammonia flames in MILD conditions
2023, Fuel
Ammonia (NH₃) combustion is an attractive energy source with zero carbon dioxide (CO₂) emissions and a potential hydrogen-combustion enabler. In addition, the availability of ammonia and its amenability to transportation make it appealing to future energy supply. To provide fundamental insights into the heat release characteristics of ammonia in moderate or intense low oxygen dilution (MILD) combustion, one-dimensional laminar and two-dimensional turbulent premixed flames are studied numerically. In particular, the flame structure and heat release distribution of ammonia and ammonia-hydrogen premixed flames are examined under conventional and MILD, and preheated conditions, respectively. The results show that stoichiometric ammonia MILD flames differ from normal ammonia flames producing NH₂ earlier, which results in the formation of a radical pool that shifts the chemical pathway towards hydrogen-related chemistry. The NH₂ radical serves as a precursor for the ammonia flame and characterizes the reaction zone, and plays a crucial role in bringing the explosive nature into the system. For fixed ratios of the root-mean-square turbulent velocity fluctuation to the laminar flame speed and integral length scale to the laminar flame thickness, the flame surface wrinkling is comparable between the stoichiometric ammonia flame at MILD and non-MILD conditions. Furthermore, the hydrogen enrichment of the lean ammonia flame results in a significant increase in heat release rate due to the preferential diffusion effect. The study also examines chemical markers to identify heat-releasing regions of the flames and proposes the use of NH₂ × O (on a mass fraction basis) as a suitable indicator of regions with high chemical activity and correlates well with the heat release for the MILD and non-MILD ammonia flames as well as ammonia-hydrogen flames at both laminar and turbulent conditions.
Highly-scalable GPU-accelerated compressible reacting flow solver for modeling high-speed flows
2023, Computers and Fluids
Emerging supercomputing systems utilize a combination of central processing units (CPUs) and graphics processing units (GPUs) in an effort to reach exascale capabilities while minimizing the energy footprint of operating such systems. Such heterogeneous machines introduce new challenges for fluids solvers because the hardware architecture and operation of a GPU are fundamentally different from conventional CPUs. In this work, a general approach for efficient implementation of finite-volume based reacting flow solvers on such heterogeneous systems is presented. Three main challenges, namely, data access pattern, thread divergence, and thread safety, are addressed. Since compressible reacting flows require special methods to deal with chemical reactions, hyperbolic and nonlinear convection terms, and the presence of turbulence, specific algorithms that ensure GPU-based efficiency are developed. The approach is demonstrated on the widely available OpenFOAM open source software by modifying core algorithms for GPU accessibility. The scalability of the resulting solver, is demonstrated using practical test cases, including flow through a scramjet engine and the dynamics of a rotating detonation engine. The solver provides near-ideal scaleup on a large number of GPUs ( $> 3000$ ), and extremely efficient use of the GPUs, with throughput nearly a constant even when processing a large number of control volumes.
Detonation peninsula of different stoichiometric ammonia/hydrogen/air mixtures under engine-relevant conditions
2023, Combustion and Flame
This study numerically investigates the detonation development of carbon-free fuels, namely ammonia and hydrogen (NH $_{3}$ and H $_{2}$ ), using one-dimensional (1D) simulations under the end-gas autoignitive conditions relevant to internal combustion (IC) engines. Five stoichiometric NH $_{3}$ /H $_{2}$ /air mixtures with different NH $_{3}$ /H $_{2}$ blending ratios are studied. A 1D hot spot with varied lengths and temperature gradients is used to induce different ignition modes. The detonation peninsulas are quantitatively identified by two non-dimensional parameters, namely the resonance parameter, $ξ$ , and the reactivity parameter, $ε$ . Increasing the H $_{2}$ blending ratio up to 80% results in a unique horn-shaped detonation peninsula, i.e., the magnitude of the upper and lower $ξ$ limits, $ξ_{u, l}$ , near the leftmost boundaries of the detonation peninsula of the rich-H $_{2}$ mixtures becomes larger by an order of magnitude as compared to those of the lean-H $_{2}$ mixtures. Such behavior is attributed primarily to the large heat diffusion of hydrogen, leading to rapid heat dissipation of the hot spot and the significantly decreased transient $ξ$ over time, thus promoting detonation development. The analysis reveals that the characterization of detonation propensity in the rich-H $_{2}$ mixtures needs to account for the fast heat diffusion of the initial hot spot, in which the initial magnitude of $ξ$ is not representative of its detonability. As such, a correction factor, $β$ , weighted by the ignition Damköhler number, is proposed to resolve the discrepancy of the $ξ_{u, l}$ limits between different NH $_{3}$ /H $_{2}$ /air mixtures. With this correction, the transient magnitudes of $ξ$ , $ξ_{t}$ , prior to the main ignition are well predicted such that a unified shape of the detonation peninsula for different NH $_{3}$ /H $_{2}$ /air mixture compositions is achieved.

View all citing articles on Scopus

^☆: Notice: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan http://energy.gov/downloads/doe-public-access-plan.

View full text

Direct numerical simulations of turbulent reacting flows with shock waves and stiff chemistry using many-core/GPU acceleration☆

Highlights

Abstract

Introduction

Section snippets

Governing equations

Numerical method

Sod’s shock tube problem

Performance characteristics

Conclusions

CRediT authorship contribution statement

Declaration of Competing Interest

Acknowledgments

Prog Energy Combust Sci

Proc Combust Inst

Prog Energy Combust Sci

Proc Combust Inst

Combust Flame

Combust Flame

Proc Combust Inst

Proc Combust Inst

Combust Flame

Combust Flame

Combust Flame

Fuel

Combust Flame

Proc Combust Inst

Proc Combust Inst

Proc Combust Inst

Combust Flame

Combust Flame

Combust Flame

Proc Combust Inst

Appl Numer Math

Appl Numer Math

J Comput Phys

Comput Fluids

Comput Fluids

Combust Flame

J Comput Phys

Comput Fluids

J Comput Phys

Parallel Comput

Comput Fluids

J Comput Phys

J Comput Phys

J Comput Phys

J Comput Phys