1 Introduction

The Sabatier process, named after the French chemists Paul Sabatier and Jean-Baptiste Senderens who reported it in 1902 [1], is given by the reversible exothermic reaction

$$\begin{aligned} \hbox {CO}_2 + 4 \hbox {H}_2 \rightleftharpoons \hbox {CH}_4 + 2\hbox {H}_2\hbox {O}, \quad \Delta H^0 \approx -165\,\hbox {kJ/mol}\, (\text {at } 25^{\circ }\hbox {C}). \end{aligned}$$
(1.1)

This reaction has been investigated, e.g., in the context of in situ resource utilization on mars [2, 3], for life-support systems on the ISS [4], and it is also used for power-to-gas applications [5, 6], and cogeneration systems [7, 8]. An overview over these and various other applications can be found, e.g., in [9]. Microchannel geometries are particularly interesting for chemical reactions as the large specific surface area of the microchannels allows for high performance of catalytic reactions as well as precise temperature management by means of appropriate temperature control systems. Such reactors have already been investigated for the Sabatier reaction, e.g., in [2, 3, 10, 11].

The purpose of this paper is to investigate the optimization of such a microchannel reactor for the Sabatier reaction using techniques from PDE-constrained optimization. Such methods are widely used in various physical applications, e.g., for the optimization of chemical reactions in conventional sized reactors [12,13,14], the optimization of semiconductors [15, 16], glass cooling processes [17, 18], or for the optimal shape design of microchannel cooling systems [19, 20], aircrafts [21, 22], and polymer spin packs [23, 24]. The optimization of microchannel reactors has also been investigated previously using derivative-free approaches only, e.g., in [25,26,27,28]. To the best of our knowledge, the derivative-based optimization of microchannel reactors with methods from PDE-constrained optimization has not been considered in the literature so far.

Throughout this paper, we consider the same setting as in [11, 29], where a microchannel reactor for the Sabatier reaction is investigated by means of experiments and simulations. To model this reactor mathematically, we introduce the following two models. First, we present a three-dimensional model that contains all important physical and chemical effects of the reactor. Second, we derive a one-dimensional model from the first one using a homogenization procedure similar to [19]. As both models are given by strongly coupled and highly nonlinear systems of PDEs, we use our software package cashocs [30] for the numerical solution of the subsequent optimization problems. In particular, cashocs is used to determine the relevant kinetic parameters for both models by solving a parameter identification problem constrained by the one-dimensional reactor model. The obtained results show excellent agreement with the experimental results reported in [11, 29]. Subsequently, a numerical comparison of both models shows that the one-dimensional model approximates the three-dimensional one very well, which validates our approach of using the one-dimensional model as the PDE constraint for the parameter identification.

Finally, we consider the following two optimization problems for the reactor. First, we use a cost functional based on the tracking of the \(\hbox {CO}_{2}\) conversion at the reactor outlet and treat the surrounding temperature of the reactor, i.e., its wall temperature, which can be influenced by means of appropriate temperature control systems, as optimization variable while keeping the inlet flow rate fixed. In the second case, we consider the inlet gas velocity of the reactor and its wall temperature as optimization variables and use a objective functional for maximizing the mass flow rate in the reactor. In this case, the quality of the product is ensured by use of a state constraint for the \(\hbox {CO}_{2}\) conversion. Both problems are solved numerically using our software cashocs, and the obtained results show great potential for improving the design of microchannel reactors.

This paper is structured as follows. We begin with the introduction of our mathematical models and an investigation of the Sabatier reaction in Sect. 2. Afterwards, in Sect. 3, we give a brief description of our software package cashocs which is used for the automated treatment of optimal control problems considered in the following sections. The necessary kinetic reaction parameters are determined in Sect. 4, where we also compare both reactor models numerically. Finally, we investigate the numerical optimization of the reactor and discuss the potential for optimizing its design in Sect. 5.

2 Model formulation

We first give some preliminary notations regarding our setting. Afterwards, we introduce a three- and a one-dimensional model for the reactor and investigate the behavior of the Sabatier reaction. We conclude this section by detailing the numerical solution procedure for both models.

Fig. 1
figure 1

Schematic of the problem setting with inlet part \(\varOmega ^\text {in}\) (blue) and catalytic reaction part \(\varOmega ^\text {reac}\) (orange). (Color figure online)

2.1 Preliminary notations

For the modeling of the Sabatier process in a microchannel reactor, we consider the experimental setting from [11, 29], where such a reactor is investigated by experiments and simulations. The reactor consists of 80 identical microchannels, each having a width of \(W = 450\,\upmu \hbox {m}\), a height of \(H = 150\,\upmu \hbox {m}\), and a length of \(L = 5\,\hbox {cm}\) (cf. Table 1). As in [11], we assume that the flow is distributed uniformly between the channels so that we can model the behavior of the whole reactor by simulating a single channel, which is a well-established assumption in the literature [11, 31, 32]. Additionally, we consider an inlet section, where we assume that no reaction occurs, with a length of \(\nicefrac {L}{10}\) in front of the catalytic reaction part. Hence, throughout this paper, the geometry of the entire reactor is denoted by \(\varOmega = (-\nicefrac {L}{10}, L) \times (0, W) \times (0, H)\) which is divided into the catalytic reactor domain \(\varOmega ^\text {reac}= (0, L) \times (0, W) \times (0, H)\) and the inlet domain \(\varOmega ^\text {in}= (-\nicefrac {L}{10}, 0) \times (0, W) \times (0, H)\), where we have no catalyst and, hence, can neglect the chemical reaction. The boundary \(\varGamma = \partial \varOmega \) of the channel is divided into three disjoint parts. The inlet \(\varGamma ^\text {in}\) at \(x=-\nicefrac {L}{10}\), where the fluid enters the domain, the wall boundary \(\varGamma ^\text {wall}\), which bounds the domain and encloses the flow, and the outlet \(\varGamma ^\text {out}\) at \(x=L\), where the fluid leaves the domain. This setting is schematically depicted in Fig. 1, where the geometry of the reactor is shown. For further details and schematics, we refer to [11]. Finally, we remark that since the authors of [11, 29] only considered stoichiometric inlet conditions, i.e., mole fractions of \(\nicefrac {1}{5}\) for \(\hbox {CO}_{2}\) and \(\nicefrac {4}{5}\) for \(\hbox {H}_{2}\), we only consider these as inlet conditions throughout this paper.

2.2 Mathematical model

We assume that the chemically reacting gas mixture obeys the ideal gas law \(\rho = ({p_{\text {tot}}M})/({RT})\), where \(p_{\text {tot}}\) denotes the total pressure of the gas, \(M\) is its average molar mass, \(T\) its temperature, and \(R\) is the universal gas constant. Further, we assume that the flow is weakly compressible, which implies that \(p_{\text {tot}}\) can be written as \(p_{\text {tot}}= p_{\text {ref}}+ p\) with a constant reference pressure \(p_{\text {ref}}\) and pressure variations \(p\), so that \(p\ll p_{\text {ref}}\) and \(\nabla p_{\text {tot}}= \nabla p\), as in [33]. Hence, we can approximate the ideal gas law by

$$\begin{aligned} \rho = \frac{p_{\text {ref}}M}{RT} \end{aligned}$$
(2.1)

as the contribution of the pressure variations towards the change in density of the gas is negligible. Moreover, for the derivation of a general model for chemically reacting fluid flow, we assume that the reaction is described by the following system of \(N_{\text {r}}\) elementary and reversible chemical reactions between \(N_{\text {s}}\) chemical species, denoted by the symbol \(\mathcal {M}_k\) for \(k=1,\ldots , N_{\text {s}}\),

$$\begin{aligned} \sum _{k=1}^{N_{\text {s}}} \nu _{k, j}' \mathcal {M}_k\rightleftharpoons \sum _{k=1}^{N_{\text {s}}} \nu _{k, j}'' \mathcal {M}_k, \quad j=1,\ldots ,N_{\text {r}}, \end{aligned}$$
(2.2)

where \(\nu _{k, j}\) is the (non-negative) stoichiometric coefficient of species k in reaction j. Note, that the reaction also affects the density of the gas mixture [cf. (2.1)] by changing its average molar mass M, which is defined as follows:

$$\begin{aligned} M= \left( \sum _{k=1}^{N_{\text {s}}} \frac{Y_k}{M_{k}}\right) ^{-1}, \end{aligned}$$

where \(Y_k\) is the mass fraction of species k, which are detailed below, and \(M_{k}\) is the average molar mass of species k.

We describe the chemically reacting flow in the channel by the following system of PDEs:

$$\begin{aligned} \left\{ \quad \begin{array}{lll} \nabla \cdot \left( \rho \varvec{u}\right) &{}= 0 \quad \text { in } \varOmega , \\ \rho \left( \varvec{u}\cdot \nabla \right) \varvec{u}+ \nabla p- \nabla \cdot \left( \mu \nabla \varvec{u}\right) - \nabla \left( \nicefrac {\mu }{3}\ \nabla \cdot u \right) &{}= 0 \quad \text { in } \varOmega , \\ \rho C_\mathrm{p}\ \varvec{u}\cdot \nabla T- \nabla \cdot \left( \kappa \nabla T\right) - \dot{\omega }_{\mathrm{T}, \text {r}}- \dot{\omega }_{\mathrm{T}, \text {d}}&{}= 0 \quad \text { in } \varOmega , \\ \rho \ \varvec{u}\cdot \nabla Y_k+ \nabla \cdot \left( \rho \varvec{V^\mathrm{c}}Y_k\right) - \nabla \cdot \left( \rho D_{\mathrm{k}, \text {mix}}\nabla Y_k\right) - \dot{\omega }_{k}&{}= 0 \quad \text { in } \varOmega , \quad k=1, \ldots , N_{\text {s}}, \end{array} \right. \end{aligned}$$
(2.3)

where \(\varvec{u}\) denotes the gas mixture’s velocity, \(T\) its temperature, and \(\nabla p\) corresponds to the gradient of the pressure (cf. the remarks above). Moreover, \(Y_k\) denotes the mass fraction of species k, which are combined to the vector \(\varvec{Y^{\text {vec}}}= [Y_1, \ldots , Y_{N_{\text {s}}}]^\top \). Since the mass fractions sum to unity, we can get rid of, e.g., the last of the chemical species and compute its mass fraction via \(Y_{N_{\text {s}}} = 1 - \sum _{k=1}^{N_{\text {s}}- 1} Y_k\), which decreases the computational complexity of the model. Further, \(\rho \), \(\mu \), \(C_\mathrm{p}\), and \(\kappa \) denote the gas mixture’s density, viscosity, specific heat capacity, and thermal conductivity, respectively, and \(D_{\mathrm{k}, \text {mix}}\) denotes the mixture-averaged diffusion coefficients. Note, that the equations of the PDE system (2.3) model the conservation of mass, momentum, enthalpy, and chemical species, respectively.

The term \(\dot{\omega }_{k}\) models the conversion between the species due to the chemical reaction (2.2) and is given by

$$\begin{aligned} \dot{\omega }_{k}= {\left\{ \begin{array}{ll} 0 \quad &{}\text { in } \varOmega ^\text {in}, \\ M_{k}\sum _{j=1}^{N_{\text {r}}} \nu _{k, j} Q_j \quad &{}\text { in } \varOmega ^\text {reac}, \end{array}\right. } \end{aligned}$$

where we define \(\nu _{k, j} = \nu _{k, j}'' - \nu _{k, j}'\) and \(M_{k}\) is the molar mass of species k. Further, \(Q_j\) is the rate of progress of reaction j, given by

$$\begin{aligned} Q_j = k_{\text {f}, \mathrm{j}}\left( \prod _{k=1}^{N_{\text {s}}} [X_k]^{\nu _{k, j}'} - \frac{\prod _{k=1}^{N_{\text {s}}} [X_k]^{\nu _{k, j}''}}{k_{\text {eq}, \mathrm{j}}} \right) , \end{aligned}$$
(2.4)

where \([X_k] = ({\rho Y_k})/{M_{k}}\) denotes the molar concentration of species k and \(k_{\text {eq}, \mathrm{j}}\) is the equilibrium constant of reaction j, which is detailed in “Appendix A”. Further, \(k_{\text {f}, \mathrm{j}}\) is the forward rate constant of reaction j, which is modeled by an Arrhenius law

$$\begin{aligned} k_{\text {f}, \mathrm{j}}= A_j\exp \left( - \frac{E_{\text {a}, \mathrm{j}}}{RT} \right) , \end{aligned}$$

where \(A_j\) is the so-called pre-exponential factor and \(E_{\text {a}, \mathrm{j}}\) is the activation energy for reaction j. Note, that the reaction source term \(\dot{\omega }_{k}\) vanishes in \(\varOmega ^\text {in}\) and is only active in \(\varOmega ^\text {reac}\), as discussed in Sect. 2.1. Further, we remark that (2.4) is only valid for elementary chemical reactions, hence our assumption for (2.2).

The correction velocity \(\varvec{V^\mathrm{c}}\), which is defined as

$$\begin{aligned} \varvec{V^\mathrm{c}}= \sum _{k=1}^{N_{\text {s}}} D_{\mathrm{k}, \text {mix}}\nabla Y_k, \end{aligned}$$

is used to ensure the consistency between the mass and species conservation equations for the mixture-averaged diffusion model, as is explained in detail, e.g., in [33, 34]. The heat generated by the reaction is modeled through the source term \(\dot{\omega }_{\mathrm{T}, \text {r}}\), which is given by

$$\begin{aligned} \dot{\omega }_{\mathrm{T}, \text {r}}= -\sum _{k=1}^{N_{\text {s}}} h_{k}(T)\ \dot{\omega }_{k}, \end{aligned}$$

where \(h_{k}\) is the pure species specific enthalpy (cf. “Appendix A”). Finally, the heat generated due to the molecular diffusion of the species is modeled by the term \(\dot{\omega }_{\mathrm{T}, \text {d}}\), which reads

$$\begin{aligned} \dot{\omega }_{\mathrm{T}, \text {d}}= - \left( \rho \sum _{k=1}^{N_{\text {s}}} C_{\mathrm{p}, \mathrm{k}}\left( Y_k\varvec{V^\mathrm{c}}- D_{\mathrm{k}, \text {mix}}\nabla Y_k\right) \right) \cdot \nabla T. \end{aligned}$$

Note that all transport parameters for the model depend on the fluid’s chemical composition, represented by the mass fractions \(\varvec{Y^{\text {vec}}}\), and on the fluid’s temperature \(T\) through constitutive relations. In particular, we have that

$$\begin{aligned} \rho = \rho (\varvec{Y^{\text {vec}}}, T), \quad \mu = \mu (\varvec{Y^{\text {vec}}}, T), \quad C_\mathrm{p}= C_\mathrm{p}(\varvec{Y^{\text {vec}}}, T), \quad \kappa = \kappa (\varvec{Y^{\text {vec}}}, T), \quad D_{\mathrm{k}, \text {mix}}= D_{\mathrm{k}, \text {mix}}(\varvec{Y^{\text {vec}}}, T), \end{aligned}$$

which leads to a strong and highly nonlinear coupling between the individual equations of (2.3). The constitutive relations detailing these dependencies are given in “Appendix A”.

The system (2.3) is supplemented with the following boundary conditions. On the inlet \(\varGamma ^\text {in}\) we use Dirichlet conditions to prescribe the ingoing velocity, temperature, and mass fractions of the gas mixture, i.e., we use

$$\begin{aligned} \left\{ \quad \begin{array}{lll} \varvec{u}&{}= \varvec{u}^\text {in}\quad \text { on } \varGamma ^\text {in}, \\ T&{}= T^\text {in}\quad \text { on } \varGamma ^\text {in}, \\ Y_k&{}= Y_k^\text {in}\quad \text { on } \varGamma ^\text {in},\quad k=1,\ldots ,N_{\text {s}}. \end{array} \right. \end{aligned}$$

On the wall boundary \(\varGamma ^\text {wall}\), we use the usual no-slip condition for the velocity, which is valid as the Knudsen number of the flow is sufficiently small so that the continuum assumption holds. Additionally, we use a no-flux condition for the chemical species, which models that the species do not leave the reactor over the wall boundary, and a Dirichlet condition for the gas mixture’s temperature. The latter models that the wall temperature of the reactor can be influenced through appropriate temperature control systems (cf. Sect. 5.3). In summary, the boundary conditions on \(\varGamma ^\text {wall}\) are given by

$$\begin{aligned} \left\{ \quad \begin{array}{lll} \varvec{u}&{}= 0 \quad \text { on } \varGamma ^\text {wall}, \\ T&{}= T^\text {wall}\quad \text { on } \varGamma ^\text {wall}, \\ \rho \left( \varvec{V^\mathrm{c}}Y_k- D_{\mathrm{k}, \text {mix}}\nabla Y_k\right) \cdot \varvec{n}&{}= 0 \quad \text { on } \varGamma ^\text {wall},\quad k=1,\ldots ,N_{\text {s}}, \end{array} \right. \end{aligned}$$
(2.5)

where \(\varvec{n}\) denotes the unit outer normal vector on \(\varGamma \).

Finally, the unimpeded flow of the gas out of the reactor at the outlet \(\varGamma ^\text {out}\) is modeled by a do-nothing condition for the momentum equation and homogeneous Neumann conditions for the temperature and chemical species, i.e.,

$$\begin{aligned} \left\{ \quad \begin{array}{ll} \mu \nabla \varvec{u}\ \varvec{n}+ \frac{\mu }{3} \left( \nabla \cdot \varvec{u}\right) \varvec{n}- p\ \varvec{n}&{}= 0 \quad \text { on } \varGamma ^\text {out}, \\ \kappa \nabla T\cdot \varvec{n}&{}= 0 \quad \text { on } \varGamma ^\text {out}, \\ \rho \left( \varvec{V^\mathrm{c}}Y_k- D_{\mathrm{k}, \text {mix}}\nabla Y_k\right) \cdot \varvec{n}&{}= 0 \quad \text { on } \varGamma ^\text {out},\quad k=1,\ldots ,N_{\text {s}}. \end{array} \right. \end{aligned}$$

2.3 The Sabatier reaction in microchannel reactors

The Sabatier reaction (1.1) is a reversible exothermic reaction used to convert \(\hbox {CO}_{2}\) and \(\hbox {H}_{2}\) into \(\hbox {CH}_{4}\) and \(\hbox {H}_{2}\hbox {O}\). For most of its applications, it is desirable that the reaction proceeds as far as possible to the right, so that ideally all of the \(\hbox {CO}_{2}\) would be consumed by the reaction. It is in this regard that we consider optimizing the microchannel reactor under investigation (cf. Sect. 5). As a measure for how far the reaction has already proceeded, we use the \(\hbox {CO}_{2}\) conversion, which is defined as follows:

$$\begin{aligned} \chi ^{\mathrm{CO}_{2}} = 1 - \frac{Y_{\mathrm{CO}_{2}}}{Y^\text {in}_{\mathrm{CO}_{2}}} = 1 - \frac{n_{\mathrm{CO}_{2}}}{n^\text {in}_{\mathrm{CO}_{2}}}, \end{aligned}$$
(2.6)

where \(n^\text {in}_{\mathrm{CO}_{2}}\) denotes the molar amount of \(\hbox {CO}_{2}\) entering the domain. In this paper, we only evaluate \(Y_{\mathrm{CO}_{2}}\) and \(n_{\mathrm{CO}_{2}}\), which is the molar amount of \(\hbox {CO}_{2}\) in the gas mixture, at the outlet of the reactor (cf. Sects. 4 and 5), so that \(\chi ^{\mathrm{CO}_{2}}\) describes the total \(\hbox {CO}_{2}\) conversion of the reactor.

Let us now investigate the behavior of the Sabatier reaction. First of all, we mention that there is another possible reaction between the reactants \(\hbox {CO}_{2}\) and \(\hbox {H}_{2}\), namely the reverse water gas shift (RWGS) reaction, given by

$$\begin{aligned} {\hbox {CO}_2 + \hbox {H}_2 \rightleftharpoons \hbox {CO} + \hbox {H}_2\hbox {O}}, \quad \Delta H^0 \approx +41\,\hbox {kJ/mol}\quad (\text {at } 25^{\circ }\hbox {C}). \end{aligned}$$

However, the RWGS reaction only becomes important for temperatures over approximately 550–\(600\,^{\circ }\hbox {C}\) and shows negligible \(\hbox {CO}\) production at lower temperatures [35, 36]. For these reasons, we proceed as in [35, 36] and restrict the temperatures under investigation to be below \({600}\,^{\circ }\hbox {C}\) and neglect the RWGS reaction, which only introduces a limited error in our model.

The next point we need to address is that the Sabatier reaction is not an elementary reaction, which is required for the formula of the rate of progress (2.4). However, we decided not to break down the Sabatier reaction into a system of elementary reactions due to the following reasons. First, the Sabatier reaction is not fully understood yet from the viewpoint of elementary reaction mechanisms, with two possibilities proposed [37]. Second, using a system of elementary reactions would increase the computational complexity of the model significantly as this would introduce additional variables for the intermediate chemical species. For these reasons, we use the following modified formula for the rate of progress (2.4) developed in [38,39,40]

$$\begin{aligned} Q = k_{\text {f}}\left( \left( \prod _{k=1}^{N_{\text {s}}} [X_k]^{\nu _{k, j}'} \right) ^{n} - \left( \frac{\prod _{k=1}^{N_{\text {s}}} [X_k]^{\nu _{k, j}''}}{k_{\text {eq}}} \right) ^{n} \right) , \end{aligned}$$
(2.7)

where \(n\) is an empirical exponent. Note, that (2.7) models the Sabatier process using a single chemical reaction, so that we have \(N_{\text {r}}= 1\) and, hence, drop the index j throughout the rest of the paper. This approach has been used extensively in the literature to model the Sabatier reaction, e.g., in [6, 10, 11, 29, 35, 41, 42]. To model the forward rate constant \(k_{\text {f}}\), we again use an Arrhenius law, i.e.,

$$\begin{aligned} k_{\text {f}}= A\exp \left( - \frac{E_{\text {a}}}{RT} \right) . \end{aligned}$$
(2.8)

It is straightforward to see that both formulations yield the same thermodynamic equilibrium as explained in [39], which is a crucial feature of (2.7). Note, that even though this model for the Sabatier reaction is comparatively simple, it still covers the most important effects of the reactor. Hence, it is feasible that we consider this model to investigate the potential for improving the microchannel reactor. An investigation of more sophisticated reaction mechanisms or the inclusion of the RWGS reaction into the reactor model is beyond the scope of this paper and could be considered in future work.

Fig. 2
figure 2

Properties of the Sabatier reaction for stoichiometric inlet conditions

Let us now take a look at some important properties of the Sabatier reaction which are depicted in Fig. 2 for stoichiometric inlet conditions (cf. Sect. 2.1). In Fig. 2a the temperature dependence of the equilibrium \(\hbox {CO}_{2}\) conversion is shown for pressures of 1 bar, 5 bar, and 10 bar, which resemble the operating pressures investigated in [11, 29]. We observe that the \(\hbox {CO}_{2}\) conversion decreases monotonically with temperature and increases monotonically with pressure. Both effects are direct consequences of Le Chatelier’s principle [43] applied to the Sabatier reaction. Hence, we only consider the case \(p_{\text {ref}}= 10\,\hbox {bar}\) in this paper as the \(\hbox {CO}_{2}\) conversion is largest for this case. In Fig. 2b we can see the influence of the temperature on the equilibrium composition at the operating pressure of 10 bar. Again, we observe that the composition is more favorable for low temperatures of about \(200\,^{\circ }\hbox {C}\), where it consists almost exclusively of the products \(\hbox {CH}_{4}\) and \(\hbox {H}_{2}\hbox {O}\). The equilibrium composition becomes increasingly worse with higher temperatures as the amount of reactants increases considerably, mirroring our previous discussion regarding the equilibrium \(\hbox {CO}_{2}\) conversion. Finally, Fig. 2c shows the dependence of the rate of progress and, hence, of the reaction rate on the temperature at different levels of \(\hbox {CO}_{2}\) conversion, which we computed using the parameters determined later on (cf. Sect. 4). We see that the rate of progress increases strongly with temperature for low levels of \(\hbox {CO}_{2}\) conversion. However, the greater the \(\hbox {CO}_{2}\) conversion, the lower the rate of progress becomes. For the highest levels of \(\hbox {CO}_{2}\) conversion, we observe that the maximum rate of progress, highlighted by the markers on the graphs, is obtained at progressively lower temperatures. Additionally, for sufficiently high levels of \(\hbox {CO}_{2}\) conversion and temperatures, the rate of progress even becomes negative and, thus, reverses the direction of the reaction. These are direct consequences of the thermodynamic equilibrium which pushes the reaction into the reverse direction under such conditions.

Regarding the optimization of the reactor, these characteristics lead to the following considerations. To obtain a high reaction rate, high temperatures are desirable, especially for low levels of \(\hbox {CO}_{2}\) conversion, as shown in Fig. 2c. However, if the temperature is kept high throughout the entire reactor, the thermodynamic limitations of the reaction severely constrain the maximum achievable \(\hbox {CO}_{2}\) conversion. On the other side, low temperatures are thermodynamically favorable, but the corresponding low reaction rate only leads to low levels of \(\hbox {CO}_{2}\) conversion. These considerations lead to the conclusion that, ideally, the reactor should have high temperatures near the inlet, where we have low levels of \(\hbox {CO}_{2}\) conversion, so that we have a high reaction rate initially. Subsequently, the temperature should decrease along the channel as this leads to more favorable equilibrium compositions and is also beneficial for the rate of progress at higher levels of \(\hbox {CO}_{2}\) conversion, which are confirmed in Sect. 5, where we investigate the optimization of the reactor. Finally, we note that the above deliberations regarding the influence of the temperature on the reaction is inherent in all exothermic reactions, and not specific to our choice of the Sabatier reaction, which is, again, a consequence of Le Chatelier’s principle [43].

2.4 A one-dimensional model of the reactor

The PDE system (2.3) of Sect. 2.2 completely models the behavior of chemically reacting fluid flow for a homogeneous reaction. However, a major drawback of the model is that it is three dimensional, which makes the numerical simulation of the reactor comparatively costly and leads to a substantial bottleneck for the adjoint-based optimization of the reactor we consider in Sects. 4 and 5. To remedy this, we now use ideas similar to [19] to derive a reduced one-dimensional model of the reactor which is significantly easier to solve numerically.

As a first step, we interpret the geometry as a porous medium with porosity 1. While the boundary \(\varGamma ^\text {wall}\) is still present from this point of view due to the geometrical constraints, it now only acts as a mathematical boundary of the domain and does not interact physically with the fluid. To model the fluid velocity in such a porous medium, we use a Brinkman equation with slip boundary conditions, in analogy to [19] and the references therein, given by the system

$$\begin{aligned} \left\{ \quad \begin{array}{ll} \rho \left( \varvec{u}\cdot \nabla \right) \varvec{u}+ \nabla p- \nabla \cdot \left( \mu \nabla \varvec{u}\right) - \nabla \left( \nicefrac {\mu }{3}\ \nabla \cdot u \right) + \mu K^{-1} \varvec{u}= 0 &{}\quad \text { in } \varOmega , \\ \varvec{u}\cdot \varvec{n}= 0 &{} \quad \text { on } \varGamma ^\text {wall}, \\ \mu \ \partial _{\varvec{n}} u \times \varvec{n}= 0 &{}\quad \text { on } \varGamma ^\text {wall}, \end{array} \right. \end{aligned}$$

where \(K\) denotes the permeability of the channel. Note that the permeability is a property of the geometry only. It can be determined either analytically or numerically [44]. Its value for our setting can be found in Table 1. For the porous medium model, the above equations replace the momentum equation in (2.3) as well as the boundary condition for the velocity on \(\varGamma ^\text {wall}\) in (2.5).

Additionally, we have to modify the temperature equation as we can no longer incorporate the effect of the wall temperature on the gas mixture via a Dirichlet boundary condition due to the previously mentioned reasons. Hence, we proceed analogously to [19] and use the following equation:

$$\begin{aligned} \left\{ \quad \begin{array}{lll} \rho C_\mathrm{p}\ \varvec{u}\cdot \nabla T- \nabla \cdot \left( \kappa \nabla T\right) + h_{\text {f,s}}\left( T- T^\text {wall}\right) - \dot{\omega }_{\mathrm{T}, \text {r}}- \dot{\omega }_{\mathrm{T}, \text {d}}= 0 &{}\quad \text { in } \varOmega , \\ \rho C_\mathrm{p}\ u\cdot \varvec{n}- \kappa \nabla T\cdot \varvec{n}= 0 &{}\quad \text { on } \varGamma ^\text {wall},\\ \end{array} \right. \end{aligned}$$

where \(h_{\text {f,s}}\) is the so-called interfacial heat transfer coefficient which models the heat transfer between the gas mixture and the wall. We computed the value of \(h_{\text {f,s}}\) numerically, as described in [19] and the references therein, and the result is shown in Table 1. The no-flux boundary condition for the temperature again models that the temperature does not interact with the wall boundary.

Averaging the system with the modifications described above over the channel cross section yields a one-dimensional model of the microchannel reactor, which is given by

$$\begin{aligned} \left\{ \quad \begin{array}{lll} \partial _x \left( \rho \varvec{u}\right) &{}= 0 \quad \text { in } \varOmega , \\ \rho \left( \varvec{u} \partial _x \varvec{u}\right) + \partial _x p- \partial _x \left( \mu \partial _x \varvec{u}\right) - \partial _x \left( \nicefrac {\mu }{3} \partial _x u \right) + \mu K^{-1} \varvec{u}&{}= 0 \quad \text { in } \varOmega , \\ \rho C_\mathrm{p} \varvec{u} \partial _x T- \partial _x \left( \kappa \partial _x T\right) + h_{\text {f,s}}\left( T- T^\text {wall}\right) - \dot{\omega }_{\mathrm{T}, \text {r}}- \dot{\omega }_{\mathrm{T}, \text {d}}&{}= 0 \quad \text { in } \varOmega , \\ \rho \varvec{u} \partial _x Y_k+ \partial _x \left( \rho \varvec{V^\mathrm{c}}Y_k\right) - \partial _x \left( \rho D_{\mathrm{k}, \text {mix}} \partial _x Y_k\right) - \dot{\omega }_{k}&{}= 0 \quad \text { in } \varOmega ,\quad k=1, \ldots , N_{\text {s}}, \end{array} \right. \end{aligned}$$
(2.9)

supplemented with the boundary conditions

$$\begin{aligned} \left\{ \quad \begin{array}{lll} \varvec{u}&{}= \varvec{u}^\text {in}\quad \text { on } \varGamma ^\text {in}, \\ T&{}= T^\text {in}\quad \text { on } \varGamma ^\text {in}, \\ Y_k&{}= Y_k^\text {in}\quad \text { on } \varGamma ^\text {in},\quad k=1,\ldots ,N_{\text {s}}, \\ \mu \partial _x \varvec{u}+ \frac{\mu }{3} \partial _x \varvec{u}- p&{}= 0 \quad \text { on } \varGamma ^\text {out}, \\ \kappa \partial _x T&{}= 0 \quad \text { on } \varGamma ^\text {out}, \\ \rho \left( \varvec{V^\mathrm{c}}Y_k- D_{\mathrm{k}, \text {mix}} \partial _x Y_k\right) &{}= 0 \quad \text { on } \varGamma ^\text {out},\quad k=1,\ldots ,N_{\text {s}}. \end{array} \right. \end{aligned}$$

Note, that the wall boundary is not present in (2.9) anymore, its influence is only modeled through the additional terms in the momentum and temperature equations, as discussed above. Additionally, for the sake of better readability we do not distinguish between the three-dimensional and the one-dimensional domain and denote both of them by \(\varOmega \) as it is obvious from the context to which we refer to.

Table 1 Parameters for the reactor geometry and one-dimensional model (2.9)

Finally, we mention that for (2.9) only the form of the PDE system has changed slightly, all constitutive relations remain the same as described in “Appendix A” and Sect. 2.3. Furthermore, except for the kinetic parameters for the forward rate constant (2.8), all parameters for the model are obtained using the fits given in [45] and [46] as discussed in “Appendix A”. We determine the missing parameters, namely the pre-exponential factor \(A\), the activation energy \(E_{\text {a}}\), and the empirical exponent \(n\), by solving a parameter identification problem in Sect. 4. Further, we compare the one-dimensional model (2.9) numerically to the three-dimensional one (2.3) in Sect. 4.3 after determining the aforementioned parameters. The corresponding results (cf. Fig. 5) show that the one-dimensional model approximates the three-dimensional one very well, so that it is justified to use the former as model for the reactor throughout the rest of this paper.

2.5 Numerical solution of the models

To conclude this section, we briefly describe the methods used for solving the PDE systems (2.3) and (2.9). First, for the one-dimensional model, we discretize the corresponding interval with a uniform mesh consisting of 1001 nodes, corresponding to 1000 line segments. Second, for the three-dimensional model, we also use a uniform mesh with 44,040 nodes, corresponding to 178,200 tetrahedrons. We use the finite element software FEniCS, version 2018.1 [47, 48] to discretize both systems with a mixed finite element method using the following finite elements. The fluid’s pressure, temperature, and mass fractions for \(\hbox {CO}_{2}\), \(\hbox {H}_{2}\), and \(\hbox {CH}_{4}\) are discretized with continuous, piecewise linear Lagrangian elements, and the fluid’s velocity is discretized with continuous, piecewise quadratic Lagrangian elements. This yields the well-known Taylor–Hood finite element pair for velocity and pressure that is LBB stable for the saddle point structure of the continuity and momentum equations. As explained earlier, we calculate the mass fraction of \(\hbox {H}_{2}\hbox {O}\) via the relation \(Y_{\mathrm{H}_{2}\mathrm{O}} = 1 - \left( Y_{\hbox {CO}_{2}} + Y_{\mathrm{H}_{2}} + Y_{\mathrm{CH}_{4}} \right) \). This discretization leads to a nonlinear system of equations which has to be solved. To do so, we use iterative methods which are described below and consider an initial guess given by zero velocity and pressure as well as a constant temperature and mass fractions determined by the respective inlet conditions for both models.

As the discretization described above leads to a system with 7006 degrees of freedom (DoF’s) for the one-dimensional system (2.9), we solve it monolithically using a damped Newton method with backtracking line search based on the natural monotonicity criterion described in [49, Chap. 3.3]. The arising linear systems for the Newton method are solved with the direct sparse linear solver MUMPS from the library PETSc [50].

The numerical solution of the resulting nonlinear system for the three-dimensional model (2.3) is more involved. As the system is considerably larger than the previously considered one, we do not use a monolithic approach for its solution. Instead we use a Picard-type fixed point iteration that consists of the following steps: first, we freeze the temperature and mass fractions and solve the continuity and momentum equations to obtain values for the velocity and pressure. In the second step, we freeze the velocity, pressure, and temperature, and solve the species conservation equation for the mass fractions. The final step consists of freezing the velocity, pressure, and mass fractions, and solving the temperature equation. This is repeated until the residual of the original system reaches a relative tolerance of 1e−10. Note that in each of these steps, a nonlinear system of equations has to be solved. This is done using the same damped Newton method as discussed above. In particular, the system of continuity and momentum equations has 176,160 DoFs, the species conservation equation has 132,120 DoFs, and the temperature equation has 44,040 DoFs. As before, all resulting linear systems are solved using the solver MUMPS since they are sufficiently small to be solved by a direct method.

Since the size of the systems for the one-dimensional model is significantly smaller than the size of the ones for the three-dimensional model, the numerical solution of the former is also considerably faster. In particular, a solve of the one-dimensional model takes a few seconds, whereas it takes about an hour to solve the three-dimensional model, i.e., we get a speedup of over two orders of magnitude. This enables the fast solution of the optimization problems investigated subsequently in Sects. 4 and 5.

3 Automated derivation of adjoint equations for the numerical solution of PDE-constrained optimization problems

The objective of this section is to briefly introduce our software package cashocs [30], which implements the adjoint approach and is used to solve the PDE-constrained optimization problems considered in Sects. 4 and 5. To do so, we briefly recall the adjoint approach for PDE-constrained optimization problems and then describe our implementation of this in cashocs, which mirrors this approach.

3.1 Recapitulation of the adjoint approach

Let us start by recalling the adjoint approach for PDE-constrained optimization problems. For a detailed introduction to this topic, we refer the reader, e.g., to the textbooks [51, 52]. The general form of a PDE-constrained optimization or optimal control problem is the following equation:

$$\begin{aligned} \min _{y, u}\ J(y, u) \quad \text { s.t. } \quad e(y, u) = 0, \quad u\in U_\text {ad}, \quad \text { and } \quad y\in Y_{\text {ad}}, \end{aligned}$$
(3.1)

where \(u \in U\) and \(y \in Y\) are the so-called control and state variables that are part of appropriate Banach spaces U and Y. Further, \(J:Y \times U \rightarrow \mathbb {R}\) is the cost functional and \(e:Y \times U \rightarrow Z^*\) is an operator between Banach spaces that models the PDE constraint, where \(Z^*\) denotes the topological dual space of some Banach space Z. In particular, the so-called state equation \(e(y, u) = 0\) is equivalent to

$$\begin{aligned} \left\{ \quad \begin{aligned}&\text {Find } y \in Y \text { such that }\\&\quad \left\langle e(y, u) , z \right\rangle _{Z^*, Z} = 0 \quad \text { for all } z\in Z, \end{aligned} \right. \end{aligned}$$
(3.2)

where \(\left\langle \varphi , x \right\rangle _{B^*, B}\) denotes the duality pairing of \(\varphi \in B^*\) and \(x\in B\) for a Banach space B, which we also write as \(\varphi [x]\). Note, that (3.2) often corresponds to a variational form of a PDE. Finally, \(U_\text {ad} \subset U\) and \(Y_\text {ad} \subset Y\) are the non-empty and closed sets of admissible controls and states, respectively. These are used to model control and state constraints for the problem and can be treated by appropriate algorithms, such as projection methods for box constraints on the control variable or regularization techniques for state constraints [51, 52].

We assume that the state equation (3.2) has a unique solution y(u) for every \(u\in U\) so that \(e(y(u), u) = 0\) for all \(u \in U\). This assumption allows us to define the so-called reduced cost functional \(\hat{J} :U \rightarrow \mathbb {R}\) by

$$\begin{aligned} \hat{J}(u) = J(y(u), u). \end{aligned}$$
(3.3)

In this way, we have formally eliminated the PDE constraint and can consider the following reduced optimization problem that is equivalent to (3.1)

$$\begin{aligned} \min _{u}\ \hat{J}(u) \quad \text { s.t. } \quad u\in U_\text {ad} \quad \text { and } \quad y(u) \in Y_{\text {ad}}. \end{aligned}$$

Our goal is to efficiently compute the gradient of the reduced cost functional \(\hat{J}'(u)\) which is to be used as part of derivative-based optimization algorithms for solving (3.1). To do so, we assume that J and e are continuously Fréchet differentiable, and that \(e_y(y(u), u)\), i.e., the Fréchet derivative of e w.r.t. y, is continuously invertible, so that the implicit function theorem [51] ensures that y(u) is continuously differentiable in a neighborhood of a solution of (3.2). To calculate the gradient of the reduced cost functional, we now derive the necessary adjoint and gradient equations using a Lagrange approach. We introduce an adjoint variable \(p \in Z\) for the state equation (3.2) and set up a Lagrangian function \(L:Y\times U \times Z \rightarrow \mathbb {R}\) corresponding to (3.1) as follows:

$$\begin{aligned} L(y, u, p) = J(y, u) + \left\langle e(y, u) , p \right\rangle _{Z^*, Z}. \end{aligned}$$
(3.4)

If we insert \(y = y(u)\) into the Lagrangian, the PDE constraint vanishes and we have

$$\begin{aligned} L(y(u), u, p) = J(y(u), u) + \left\langle e(y(u), u) , p \right\rangle _{Z^*, Z} = \hat{J}(u) \quad \text { for all } p \in Z. \end{aligned}$$

Hence, differentiating the Lagrangian at (y(u), up) w.r.t. u yields

$$\begin{aligned} \left\langle \hat{J}'(u) , h \right\rangle _{U^*, U} = \left\langle \frac{\mathrm{d}}{\mathrm{d}u} L(y(u), u, p) , h \right\rangle _{U^*, U} = \left\langle L_y(y(u), u, p) , y'(u)[h] \right\rangle _{Y^*, Y} + \left\langle L_u(y(u), u, p) , h \right\rangle _{U^*, U}. \end{aligned}$$
(3.5)

The main idea of the adjoint approach is now to choose \(p = p(u) \in Z\) so that \(L_y(y(u), u, p) = 0\), i.e.,

$$\begin{aligned} \left\langle L_y(y(u), u, p) , q \right\rangle _{Y^*, Y} = 0 \quad \text { for all } q\in Y. \end{aligned}$$

This equation can be rewritten as follows:

$$\begin{aligned} 0 = \left\langle L_y(y(u), u, p) , z \right\rangle _{Y^*, Y}&= \left\langle J_y(y(u), u) , z \right\rangle _{Y^*, Y} + \left\langle e_y(y(u), u)[z] , p \right\rangle _{Z^*, Z} \\&= \left\langle J_y(y(u), u) + e^*_y(y(u), u)[p] , z \right\rangle _{Y^*, Y}, \end{aligned}$$

and we note that the use of the adjoint operator \(e^*_y\) gives the adjoint approach its name. In particular, the above equation can be interpreted as

$$\begin{aligned} \left\{ \quad \begin{aligned}&\text {Find } p \in Z \text { such that } \\&\quad \left\langle e_y^*(y(u), u)[p] , z \right\rangle _{Y^*, Y} = - \left\langle J_y(y(u), u) , z \right\rangle _{Y^*, Y}. \end{aligned} \right. \end{aligned}$$
(3.6)

As for the state equation, we assume that the adjoint equation (3.6) is well posed, and the corresponding solution \(p = p(u)\) is called the adjoint state w.r.t. u. Furthermore, it is worth noticing that the adjoint equation (3.6) is always a linear equation, usually simplifying its numerical solution significantly compared to the possibly nonlinear state equation. Inserting \(p=p(u)\) into (3.5) reveals

$$\begin{aligned} \begin{aligned} \left\langle \hat{J}'(u) , h \right\rangle _{U^*, U}&= \left\langle L_y(y(u), u, p(u)) , y'(u)[h] \right\rangle _{Y^*, Y} + \left\langle L_u(y(u), u, p(u)) , h \right\rangle _{U^*, U} \\&= \left\langle L_u(y(u), u, p(u)) , h \right\rangle _{U^*, U} \\&= \left\langle J_u(y(u), u) , h \right\rangle _{U^*, U} + \left\langle e_u(y(u), u) [h] , p(u) \right\rangle _{Z^*, Z} \\&= \left\langle J_u(y(u), u) + e_u^*(y(u), u) [p(u)] , h \right\rangle _{U^*, U}. \end{aligned} \end{aligned}$$
(3.7)

However, (3.7) only specifies the derivative of the reduced cost functional, i.e., an element of \(U^*\). In case that U is a Hilbert space, we can use the Riesz representation theorem to identify this with the gradient of the reduced cost functional, which is then given by

$$\begin{aligned} \hat{J}'(u) = J_u(y(u), u) + e_u^*(y(u), u) [p(u)] \in U. \end{aligned}$$

In summary, to compute the gradient of the reduced cost functional (3.3) for a given \(u\in U\), we solve the state equation (3.2) to obtain the state variable y(u) and then solve the adjoint equation (3.6) to determine the adjoint variable p(u). If U is a Hilbert space, we compute the gradient \(\hat{J}'(u)\) using a Riesz projection as discussed above.

3.2 Automated derivation of adjoint and gradient equations

From our recapitulation of the adjoint approach, we see that to derive the adjoint and gradient equations (3.6) and (3.7), we need to differentiate the cost functional and state equation w.r.t. the state y and the control u and to calculate the corresponding adjoint operators. For many optimization problems with complex, coupled, and nonlinear PDE constraints, it is often impossible to verify all of the necessary assumptions of the adjoint approach. Moreover, even assuming that all objects are sufficiently smooth and calculating the adjoint and gradient equations in a formal manner is not feasible for very complex PDE constraints as it involves extremely tedious and error-prone calculations [53]. Our reactor models from Sect. 2 certainly belong to this category of PDE constraint. For these reasons, we use our software package cashocs [30], which offers a numerical implementation of the continuous adjoint approach based on the finite element software FEniCS [47, 48]. In particular, it uses the built-in automatic differentiation capabilities of FEniCS to derive the corresponding adjoint and gradient equations.

In FEniCS, the cost functional and the PDE constraint of many optimization problems can be represented as variational forms in the so-called Unified Form Language (UFL). These forms can be differentiated automatically and symbolically with respect to their arguments, as explained in [48, Chap. 17]. To derive the corresponding variational forms of the adjoint and gradient equations, cashocs sets up a UFL form for the Lagrangian as in (3.4) which is subsequently differentiated symbolically w.r.t. the state and control variables [30]. Finally, the gradient \(g \in U\) of the reduced cost functional w.r.t. some Hilbert space U is computed numerically by solving the Riesz problem

$$\begin{aligned} \left\{ \quad \begin{aligned}&\text {Find } g \in U \text { such that } \\&\quad \left( g , h \right) _{U} = \left( \hat{J}'(u) , h \right) _{U} = \left( L_u(y(u), u, p(u)) , h \right) _{U} \quad \text { for all } h \in U, \end{aligned} \right. \end{aligned}$$

where \(\left( \cdot , \cdot \right) _{U}\) denotes the scalar product of U. Throughout this paper, we only consider the case \(U=L^2(\varOmega )\) or \(U=L^2(\varGamma ^\text {out})\), in particular, we use the \(L^2\) scalar product to identify the gradients of the cost functionals.

We remark that the two differentiation operations described above are the only places where automatic differentiation is used in cashocs, the rest of the package, which manages the optimization problem, as well as the optimization algorithms do not rely on automatic differentiation or pre-existing implementations. Note, that if we discretize the state problem with a Galerkin method that uses the continuous variational formulation, as we do for the models considered in this paper, our software cashocs computes the corresponding continuous variational forms of the adjoint and gradient equations, which are only discretized at a later stage. In this case, cashocs consistently discretizes the continuous adjoint approach we recalled previously [30]. Finally, we note that there are software packages that use similar ideas to derive adjoint equations utilizing the AD capabilities of the UFL, e.g., dolfin-adjoint [54] or firedrake [55, 56]. However, we do not use these packages as our own approach can be tailored better to our needs and gives us complete control over the optimization algorithms used for solving the corresponding problems.

4 Identification of kinetic reaction parameters

In this section, we determine the kinetic parameters needed to complete our models of the reactor (cf. Sect. 2.4) from the experimental results reported in [11, 29]. We introduce a parameter identification problem based on our one-dimensional reactor model which is solved numerically utilizing our software package cashocs described in Sect. 3. Finally, we compare our two reactor models from Sect. 2 numerically and see that both yield nearly identical results.

4.1 Description of the parameter identification problem

In [11, 29], the authors consider a different model for the reactor with the following major variations to our models from Sect. 2. They consider a heterogeneous reaction only occurring in a porous catalyst located close to the channel wall, whereas we consider a homogeneous reaction. Moreover, they assume that the reaction starts immediately at the inlet of the computational domain, whereas we include the inlet section \(\varOmega ^\text {in}\) where no reaction occurs. For these reasons, we cannot use the kinetic parameters reported in [11, 29] but have to determine appropriate ones for our models ourselves.

As remarked in Sect. 2.3, we restrict our investigation to the case of 10 bar for the operating pressure of the reactor. For this setting, a total of 21 experiments, considering 7 different reactor temperatures \(T^\text {wall}\) (\(250 {-}400\,^{\circ }\hbox {C}\) with increments of \(25\,^{\circ }\hbox {C}\)) and 3 different inlet flow rates (\(50\,\hbox {mL/min}, 100\,\hbox {mL/min}\), and \(150\,\hbox {mL/min}\)), were carried out in [11, 29]. Note, that here and throughout the rest of this paper, when we specify flow rates of the gas, we always assume normal conditions given by a pressure of 1 atm and a temperature of 273.15 K so that we can compare them regardless of the physical conditions. We remark that the experimental results of [11, 29] are given by measurements of the achieved \(\hbox {CO}_{2}\) conversion, and that we extracted the corresponding numerical values using the software Webplot Digitizer [57].

To determine the kinetic parameters, we consider the following parameter identification problem:

$$\begin{aligned} \min _{y, u_\mathrm{c}}\ J(y, u_\mathrm{c}) = \sum _{l=1}^{21} \frac{1}{2} \int _{\varGamma ^\text {out}} \left( \chi ^{\hbox {CO}_{2}}_{\text {sim}, \mathrm{l}} - \chi ^{\hbox {CO}_{2}}_{\text {exp}, \mathrm{l}} \right) ^2 \ \text {d}s \quad \text { s.t. } \quad e(y, u_\mathrm{c}) = 0 \quad \text { and } \quad u_\mathrm{c} \in U_\text {ad}, \end{aligned}$$
(4.1)

where the state variables are combined in the vector \(y = [p, \varvec{u}, T, \varvec{Y^{\text {vec}}}]^\top \) and the control variables are combined in the vector \(u_\mathrm{c} = [E_{\text {a}}, \log (A), n]^\top \). Note, that we use the logarithm of the pre-exponential factor, i.e., \(\log (A)\), instead of \(A\) as control variable since this ensures a better scaling of the parameters and, additionally, guarantees that the computed pre-exponential factor is positive. Moreover, we assume that \(u_c\) is constant in \(\varOmega \), i.e., that the kinetic parameters do not vary spatially. This yields a finite-dimensional optimization problem, so that we do not require additional regularization for the cost functional. Furthermore, \(\chi ^{\hbox {CO}_{2}}_{\text {exp}, \mathrm{l}}\) denotes the \(\hbox {CO}_{2}\) conversion measured in experiment l, and \(\chi ^{\hbox {CO}_{2}}_{\text {sim}, \mathrm{l}}\) is the \(\hbox {CO}_{2}\) conversion of the lth simulation, which are calculated via (2.6). The operator for the state system \(e(y, u_\mathrm{c})\) is given by

$$\begin{aligned} e(y, u_\mathrm{c}) = [e_1(y, u_\mathrm{c}), \ldots , e_{21}(y, u_\mathrm{c})]^\top , \end{aligned}$$

where \(e_l(y, u_\mathrm{c}) = 0\) is the weak form of the state system (2.9) with appropriate values for \(\varvec{u}^\text {in}\) and \(T^\text {wall}\) corresponding to the lth experiment, as discussed above. Finally, the set of admissible controls is given by

$$\begin{aligned} U_\text {ad} = \{[E_{\text {a}}, \log (A), n]^\top \in \mathbb {R}^3 | n\ge 0\}, \end{aligned}$$

which models that the empirical exponent is supposed to be non-negative. Note, that the cost functional in (4.1) corresponds to a least-squares problem for fitting the \(\hbox {CO}_{2}\) conversion to the experimental results.

4.2 Numerical results

Fig. 3
figure 3

History of the numerical optimization

We solve the parameter identification problem (4.1) numerically using our software cashocs described in Sect. 3. For the discretization of the adjoint system, we choose analogous finite elements as for the discretization of the state system (cf. Sect. 2.5). Moreover, we use a projected limited memory BFGS (L-BFGS) method with a memory of 5 vectors as optimization algorithm. For the computation of the step size, we use an Armijo line search with initial guess one for the step size. Moreover, we restart the algorithm with a gradient step in case the respective curvature condition for the BFGS method is not satisfied. The optimization algorithm is terminated once a relative tolerance of 1e−6 for the stationarity measure is reached. Our initial guess for the kinetic parameters is based on the parameters computed in [11, 29] and is given by \(E_{\text {a}}= 65\,\hbox {kJ/mol}\), \(\log (A)= 12\), and \(n= 0.222\). We refer the reader, e.g., to [58, 59] for a detailed description of the algorithm and to [60] for its application in the context of PDE-constrained optimization.

The history of the cost functional and the relative stationarity measure over the course of the optimization are shown in Fig. 3. We observe that the algorithm terminates successfully after 48 iterations, suggesting that we find a local minimizer (or stationary point) of (4.1). Since one solve of the state system requires 21 solves of (2.9), one for each experiment, this amounts to a total of 1113 solves for (2.9), and 1008 solves for the corresponding adjoint system over the course of the optimization. Note that the additional solves needed for the state system are a consequence of the Armijo line search. In Fig. 4, the simulated and measured \(\hbox {CO}_{2}\) conversions are depicted for all 21 experimental settings for the identified kinetic parameters. We see that our model shows excellent agreement to the experimental results. The largest difference between simulated and measured \(\hbox {CO}_{2}\) conversion is about 7%, and the mean error is around 1.9%, which is well within the reproducibility of the experiment of 5% from [11, 29] and closer than the original fit proposed there. For these reasons, we conclude that our model is able to simulate the physical and chemical processes in the reactor sufficiently well so that we can use it in Sect. 5 to optimize the reactor.

Fig. 4
figure 4

Simulated and measured \(\hbox {CO}_{2}\) conversion for the experiments from [11, 29]

Table 2 Kinetic parameters determined by the parameter identification

We also remark that for the parameters identified above, the solution shows a nearly isothermal behavior, in particular, the difference between the simulated gas temperature and the wall temperature is well below \(1\,^\circ \hbox {C}\) at all points in the reactor. Hence, we could also use an isothermal model for the reactor, where the temperature is not treated as a state variable, but directly prescribed as the wall temperature, simplifying the numerical simulations. However, as this behavior cannot be known before the determination of the kinetic parameters and since other sets of parameters could yield a non-isothermal behavior, we keep the temperature as a state variable.

Finally, the identified parameters are shown in Table 2. We remark that they are in a good agreement with the kinetic parameters found in [6], where the empirical exponent \(n\) and the activation energy \(E_{\text {a}}\) were found to be 0.076 and \(65.2\,\hbox {kJ/mol}\). Since we consider a higher operating pressure, a different type of catalyst, and also use a slightly different model to the ones in [6], the minor deviations for the identified parameters are justified.

4.3 Numerical comparison of the reactor models

To conclude this section, we briefly compare the one-dimensional model (2.9) to the three-dimensional model (2.3). For this, we simulate all of the 21 test cases of [11, 29], using the kinetic parameters obtained previously, with both models and compute the corresponding relative errors between the models in the \(L^\infty (\varOmega ), L^2(\varOmega )\), and \(L^1(\varOmega )\) norms.

Fig. 5
figure 5

Relative errors between the 1D and 3D model

To compare the models to each other, we average the results of the three-dimensional model over the channel cross section, which can then be directly compared to the results of the one-dimensional model. The arising errors are shown in Fig. 5 in the form of a box plot. We see that the relative error between both models is well below 1% for all test cases and variables. The highest relative errors are obtained for the pressure with about 0.2%, all other variables have relative errors even below 1 ‰. This is the case since the physical and chemical behavior in a single channel of the reactor is basically one dimensional, with variations along the channel length being significantly larger than variations along its cross section. Finally, we note that the relative errors for the temperature are very small even compared to the small errors of the other variables. This is due to fact that we use the Kelvin scale to compute the relative errors for the temperature as this is a ratio scale. These results validate our approach to use the 1D porous medium model for computing the necessary kinetic parameters for the models instead of the more costly three-dimensional one since both yield basically identical results.

5 Optimization of the microreactor

In this section, we discuss the optimization of the microchannel reactor under consideration by means of solving PDE-constrained optimal control problems. Throughout this section, we use the one-dimensional model (2.9), which models the behavior of the reactor as validated in the previous section, as corresponding state system for the optimization problems. We consider two optimization problems for improving the reactor’s performance which we solve numerically using our software package cashocs described in Sect. 3. We discuss the obtained results as well as their applicability and realizability.

5.1 Optimization with fixed flow rates

For the first problem, we keep the inlet flow rate fixed at the respective values considered in [11, 29]. Our goal is to maximize the \(\hbox {CO}_{2}\) conversion of the reactor, since this corresponds to maximizing the product yield for a fixed flow rate, by using the wall temperature \(T^\text {wall}\) as optimization variable, which is detailed below. To model this, we consider the following optimization problem:

$$\begin{aligned} \min _{y, T^\text {wall}}\ J(y, T^\text {wall}) = \frac{1}{2} \int _{\varGamma ^\text {out}} \left( \chi ^{\hbox {CO}_{2}} - 1 \right) ^2 \ \text {d}s \quad \text { s.t. } \quad e(y, T^\text {wall}) = 0 \quad \text { and } \quad T^\text {wall}\in U_\text {ad}. \end{aligned}$$
(5.1)

As in Sect. 4, the state variables are combined into the vector \(y = [p, \varvec{u}, T, \varvec{Y^{\text {vec}}}]^\top \) and the \(\hbox {CO}_{2}\) conversion \(\chi ^{\hbox {CO}_{2}}\) is defined as in (2.6). Moreover, the wall temperature \(T^\text {wall}\) now plays the role of the control variable, and the state system \(e(y, T^\text {wall}) = 0\) is given by the corresponding weak form of (2.9), with the respective inlet conditions for the velocity. Note, that we use a tracking-type cost functional which aims at bringing the \(\hbox {CO}_{2}\) conversion as close as possible to 1 on \(\varGamma ^\text {out}\). Note, that this case, i.e., \(\chi ^{\hbox {CO}_{2}} = 1\), corresponds to the total conversion of \(\hbox {CO}_{2}\) and, therefore, to the maximum product yield. We remark that we tested several alternative cost functions that also have the goal of maximizing the reactor’s yield and found that all of them gave nearly identical results in practice. Hence, it is justified that we restrict ourselves to the cost functional given in (5.1). Finally, the set of admissible controls \(U_\text {ad}\) is given by

$$\begin{aligned} U_\text {ad} = \{T^\text {wall}\in L^2(\varOmega ) | 180\,^\circ \hbox {C} \le T^\text {wall}\le 600\,^\circ \hbox {C} \text { a.e. in } \varOmega \}, \end{aligned}$$

i.e., we have box constraints for the wall temperature. Due to these box constraints, we do not require additional regularization terms for the optimization problem (5.1). Since the model is quasi-isothermal (cf. Sect. 4.2), this constraint also restricts the temperature of the gas mixture. Note, that the lower bound of \(180\,^\circ \hbox {C}\) corresponds to the boiling point of water at a pressure of 10 bar, so that we do not have to consider any phase change effects. The upper bound for the temperature is chosen so that we can neglect the effect of the RWGS reaction and the corresponding \(\hbox {{CO}}\) production (cf. Sect. 2.3).

For the wall temperature, we consider the following four models. First, we use a constant value for \(T^\text {wall}\) throughout the entire reactor, which corresponds to an isothermal reactor. Second, we divide \(\varOmega ^\text {reac}\) into two halves along its length and restrict \(T^\text {wall}\) to be constant in each half, which models a reactor consisting of two isothermal stages. Third, we proceed analogously to before, but now divide \(\varOmega ^\text {reac}\) into thirds, which corresponds to a three-stage isothermal reactor. For the final model, we do not specify any sort of profile and consider \(T^\text {wall}\) to be an arbitrary function in \(L^2(\varOmega )\), which we discretize with piecewise linear Lagrangian finite elements. We refer to these models as constant, two-stage, three-stage, and distributed model, respectively, throughout the rest of this paper. These models exhibit different levels of complexity, particularly regarding their realizability and capability of controlling the reaction, which we discuss in Sect. 5.3. In particular, we note that while these models may only be approximately realizable in practice, they still serve as benchmark for the optimization by showcasing the theoretical limitations of the reactor.

Table 3 Number of solves for state/adjoint system for problem (5.1)
Fig. 6
figure 6

Optimized temperature profiles for problem (5.1)

We solve the optimization problem (5.1) numerically for all four models and the three inlet flow rates of 50 mL/min, 100 mL/min, and 150 mL/min, as considered in [11, 29], using our software package cashocs (cf. Sect. 3). For the corresponding adjoint system we use an analogous discretization to the one for the state system (cf. Sect. 2.5). As optimization algorithm we again use the projected L-BFGS method described in Sect. 4.2, which is terminated once the relative stationarity measure reaches a tolerance of 1e−4. The required amount of PDE solves for the state and adjoint systems for the optimization algorithm are tabulated in Table 3. Note, that the number of iterations of the algorithm corresponds to the number of solves for the adjoint system minus one, as we need an additional gradient computation for the termination criterion. We observe that the algorithm is very efficient as it needs less than 20 iterations to reach the desired tolerance for the stationarity measure in all cases. Since the number of solves for the state system exceeds the ones for the adjoint system only for one case and one additional solve, we observe that the initial step length is almost always accepted by the algorithm. We note that with increasing complexity of the models, we need slightly more iterations to reach the desired tolerance which is, however, to be expected.

The optimized wall temperatures are depicted in Fig. 6 together with the corresponding \(\hbox {CO}_{2}\) conversions. Comparing the \(\hbox {CO}_{2}\) conversions with the ones from the experiments (cf. Sect. 4) we recognize that we achieve considerably higher levels of \(\hbox {CO}_{2}\) conversion for all flow rates by optimizing the wall temperature. Moreover, by comparing the achieved \(\hbox {CO}_{2}\) conversions with the equilibrium conversions (cf. Fig. 2a), we can also conclude that the thermodynamic equilibrium is a limiting factor in all cases as the difference between the obtained and the equilibrium \(\hbox {CO}_{2}\) conversions are only about 1–2% for the respective outlet temperatures. We observe that the optimized wall temperatures are at their largest near the inlet of the channel and then decrease monotonically along its length. This yields a large reaction rate at the beginning of the reactor as well as more favorable thermodynamic equilibria towards its end, which are actually limiting the reaction as discussed above. Moreover, note that the optimized wall temperatures increase monotonically with the flow rate, due to the following. As a higher flow rate corresponds to a lower residence time of the gas in the reactor, a higher reaction rate is needed to increase the \(\hbox {CO}_{2}\) conversion, which can be achieved through a higher wall temperature (cf. Fig. 2c). This confirms our deliberations from Sect. 2.3 regarding the dependence of the thermodynamic equilibrium and the reaction rate on the temperature w.r.t. the optimization of the reactor.

Analyzing the obtained \(\hbox {CO}_{2}\) conversions more closely, we first observe that the \(\hbox {CO}_{2}\) conversion increases monotonically with the complexity of the wall temperature models, i.e., the constant model performs worst for all flow rates, the two- and three-stage models perform increasingly better, and the distributed model has the highest conversion of all profile types for each flow rate. In particular, the increase in conversion is roughly 2% between the constant and two level profiles and then only about 0.5% each for the successive models. Hence, we conclude that the ability to use a lower temperature towards the end of the reactor can have an important effect on the quality of the reaction, but using more sophisticated temperature profiles does not add a great amount of additional improvements on top of this. Moreover, we observe that the optimized conversion also decreases monotonically with increasing flow rate. This is due to the fact that for higher flow rates, a larger temperature has to be used to obtain a sufficiently high reaction rate, as mentioned previously. This then leads to worse thermodynamic equilibria and explains the overall lower \(\hbox {CO}_{2}\) conversion.

Table 4 Molar product yield in (mol/s) for problem (5.1)

Finally, let us investigate the obtained product yield which is shown in Table 4. There, we consider the total amount of product, i.e., \(\hbox {CH}_{4}\) and \(\hbox {H}_{2}\hbox {O}\) that is generated by the reactor and, due to the stoichiometry of the reaction, \(\nicefrac {1}{3}\) of this amount is \(\hbox {CH}_{4}\) and \(\nicefrac {2}{3}\) is \(\hbox {H}_{2}\hbox {O}\). Obviously, for a fixed flow rate a higher conversion leads to a higher molar product yield, so that the yield increases monotonically with the model complexity. Moreover, for the conditions under consideration, we observe that the molar yield is almost linear in the flow rate for all models. This can be explained by the fact that the achieved conversions are all rather close to each other for all models and flow rates. Therefore, the slightly lower \(\hbox {CO}_{2}\) conversion obtained at higher flow rates is compensated by the higher resulting throughput, leading to overall higher product yields for higher flow rates. These results indicate that one should try to use the highest possible flow rate so that the resulting quality of the product, measured by the \(\hbox {CO}_{2}\) conversion, is still sufficiently high. We take a more detailed look at this problem of balancing the molar yield and product quality of the reactor in the following section.

5.2 Optimization with variable flow rates and state constraints

As indicated above, our goal is now to maximize the molar yield of the reaction by considering the wall temperature and the inlet flow rate as control variables, where the product quality is ensured by means of a state constraint on the \(\hbox {CO}_{2}\) conversion. In particular, we consider the following optimization problem:

$$\begin{aligned} \min _{y, u_\mathrm{c}}\ J(y, u_\mathrm{c}) = -\int _{\varGamma ^\text {out}} \rho \varvec{u}\cdot \varvec{n}\ \text {d}s \quad \text { s.t. } \quad e(y, u_\mathrm{c}) = 0, \quad y\in Y_\text {ad}, \quad \text { and } \quad u_\mathrm{c} \in U_\text {ad}. \end{aligned}$$
(5.2)

Analogously to before, the vector of state variables is given by \(y = [p, \varvec{u}, T, \varvec{Y^{\text {vec}}}]^\top \) and the vector of control variables is given by \(u_\mathrm{c} = [\varvec{u}^\text {in}, T^\text {wall}]^\top \), i.e., now both the inlet flow rate and the wall temperature act as control variables. The state system \(e(y, u_\mathrm{c}) = 0\) is, again, given by the corresponding weak form of the one-dimensional porous medium model (2.9). The cost functional J measures the negative outgoing mass flow rate at the outlet of the reactor. Note, that thanks to the sign in J, the minimization of this cost functional is equivalent to maximizing the outgoing mass flow rate. The set of admissible controls is defined as follows:

$$\begin{aligned} U_\text {ad} = \{[\varvec{u}^\text {in}, T^\text {wall}]^\top \in \mathbb {R}\times L^2(\varOmega ) | u_\mathrm{a} \le \varvec{u}^\text {in}\le u_\mathrm{b} \text { on } \varGamma ^\text {in}\, \text { and } \, 180\,^\circ \hbox {C} \le T^\text {wall}\le 600\,^\circ \hbox {C} \text { a.e. in } \varOmega \}, \end{aligned}$$

where the box constraints for \(T^\text {wall}\) are the same as in the previous section. The constraints for the inlet flow rate are motivated by the physical limitations of the reactor, i.e., we cannot prescribe arbitrarily small or large flow rates as the reactor is not stable for such conditions. Note, that the particular choice of \(u_\mathrm{a}\) and \(u_\mathrm{b}\) is detailed below. Additionally, as we use the one-dimensional model of the reactor, the inlet velocity \(\varvec{u}^\text {in}\) corresponds to the averaged inlet flow rate, and is, thus, a scalar quantity. The state constraint is modeled by the set of admissible states \(Y_\text {ad}\) which is given by

$$\begin{aligned} Y_\text {ad} = \left\{ [p, \varvec{u}, T, \varvec{Y^{\text {vec}}}]^\top | \chi ^{\hbox {CO}_{2}} = 1 - \frac{Y_{\mathrm{CO}_{2}}}{Y^\text {in}_{\mathrm{CO}_{2}}} \ge \chi ^{\text {des}}\text { on } \varGamma \right\} , \end{aligned}$$

i.e., the state constraint is given by the requirement that the \(\hbox {CO}_{2}\) conversion is larger than a given desired conversion \(0 \le \chi ^{\text {des}}< 1\). Obviously, this can be easily transformed into a more classical form of a state constraint, namely \(Y_{\mathrm{CO}_{2}} \le C\) for some appropriate constant \(C \in (0,1)\). As we have seen in Sect. 5.1, the maximum achievable \(\hbox {CO}_{2}\) conversion decreases monotonically with the flow rate. Therefore, we conclude that for the maximum possible flow rate, the state constraint is satisfied with equality. Since the molar product yield of the reactor solely depends on the mass flow rate \(\rho \varvec{u}\) and the gas composition, which is restricted to the desired conversion as just discussed, the solution of problem (5.2) maximizes the product yield of the reactor for a given \(\hbox {CO}_{2}\) conversion \(\chi ^{\text {des}}\).

Finally, we note that problem (5.2) is a Dirichlet boundary control problem for the control variable \(\varvec{u}^\text {in}\). For the numerical solution of this problem, we proceed analogously to [61, Chap. 10.6], i.e., we treat the Dirichlet boundary condition with the help of an appropriately chosen Lagrange multiplier that consists of terms involving the respective adjoint variables. These terms correspond to the ones arising from the integration by parts carried out for the derivation of the variational formulation of the PDE. This allows us to include the boundary condition in the variational form of the problem, so that we can solve it with our software cashocs. For more details regarding Dirichlet boundary control problems we refer the reader, e.g., to [51].

For the numerical solution of problem (5.2), we treat the state constraint with a Moreau–Yosida regularization [51] which leads to the following regularized optimization problem:

$$\begin{aligned} \begin{aligned} \min _{u_\mathrm{c}}\ J_\gamma (y, u_\mathrm{c})&= -\int _{\varGamma ^\text {out}} \rho \varvec{u}\cdot \varvec{n}\ \text {d}s + \frac{1}{2\gamma } \int _{\varGamma ^\text {out}} \max \left( 0, \hat{\mu } + \gamma \left( \chi ^{\text {des}}- \chi ^{\hbox {CO}_{2}} \right) \right) ^2 \ \text {d}s \\&\text { s.t. } \quad e(y, u_\mathrm{c}) = 0 \quad \text { and } \quad u_\mathrm{c} \in U_\text {ad}. \end{aligned} \end{aligned}$$
(5.3)

Here, the state and control variables as well as the state system and \(U_\text {ad}\) are defined exactly as for problem (5.2). The cost functional \(J_\gamma \) now includes the Moreau–Yosida regularization of the state constraint with penalty parameter \(\gamma > 0\) and shift function \(\hat{\mu }\). However, we only consider the case \(\hat{\mu } = 0\) in this paper. The idea of the Moreau–Yosida regularization is to solve a sequence of regularized optimization problems (5.3) with increasing values of \(\gamma \) with \(\gamma \rightarrow \infty \) so that we obtain a solution of the state constrained problem (5.2) in the limit.

For the wall temperature, we consider the same four models as in the previous section, i.e., the constant, two- and three-stage, and distributed model. Again, the models might only be approximately realizable in practice but still act as benchmark for the optimization. For the state constraint, we use the following values for the desired \(\hbox {CO}_{2}\) conversion:

$$\begin{aligned} \chi ^{\text {des}}\in \{0.85, 0.875, 0.9, 0.925, 0.95\}. \end{aligned}$$

Additionally, the box constraints for the velocity are always chosen with \(u_\mathrm{a}\) corresponding to 50 mL/min, i.e., the lowest flow rate considered in the experiments in [11, 29], and with \(u_\mathrm{b}\) corresponding to a flow rate of at least 150 mL/min, which is explained more detailedly below. Note, that for a flow rate of 50 mL/min, the obtained optimized temperature profiles from Sect, 5.1 all have \(\hbox {CO}_{2}\) conversions of over 95% (cf. Fig. 6), which directly makes the corresponding controls and states feasible for problem (5.2) for all considered values of \(\chi ^{\text {des}}\), so that \(U_\text {ad}\) and \(Y_\text {ad}\) are, in fact, non-empty and we do not minimize over the empty set.

For the numerical solution of (5.2), we use a homotopy approach, i.e., we solve (5.3) for one value of \(\gamma \) and then use the obtained solution as initial guess for (5.3) with the next higher value of \(\gamma \). In particular, for all three piecewise constant temperature models, we use the sequence \(\gamma _l = 10^l, l=0,\ldots ,6\), and for the distributed temperature model we use the sequence \(\gamma _l = 7^l, l=0,\ldots ,7\), as the corresponding problem is slightly harder to solve numerically. We solve the subproblems (5.3) by means of a projected L-BFGS method with memory size 5 and a stopping tolerance of 1e−2 for the relative stationarity measure, as discussed in Sect. 4.2. Again, the adjoint system is discretized analogously to the state system (cf. Sect. 2.5). The value of the upper bound for the inlet velocity \(u_\mathrm{b}\) is chosen as follows. We start with a value of \(u_\mathrm{b}\) corresponding to 150 mL/min and solve the optimization problem (5.2) as discussed above. If the resulting optimized inlet velocity is located at the upper bound \(u_\mathrm{b}\), we increase the bound by 50 mL/min and repeat this procedure, until the optimized flow rate is not at the upper bound anymore. Since the maximum possible \(\hbox {CO}_{2}\) conversion decreases monotonically with the flow rate as discussed previously, it is straightforward to see that this indeed yields the maximum flow rate. Note that we do not use a fixed and sufficiently large value for \(u_\mathrm{b}\) for all settings as we cannot solve all of the corresponding problems numerically due to the following. For the initial low values of \(\gamma \), the solution to (5.3) is given by \(\varvec{u}^\text {in}= u_\mathrm{b}\) as the regularization term for the state constraint has only a minor influence on the problem. The higher the value of \(\gamma \), the larger this influence becomes and the smaller the flow rate has to be to satisfy the state constraint. This, however, leads to very bad initial guesses for the subsequent problem with a larger penalty parameter, making the numerical solution of (5.3) very hard or even impossible, so that the whole approach of regularizing the state constraint becomes pointless in this case.

Before we discuss the numerical results, we remark that the absolute difference between the \(\hbox {CO}_{2}\) conversion obtained by solving (5.2) numerically and the desired conversion \(\chi ^{\text {des}}\) is less than 0.01% in all considered cases, which is sufficiently accurate for our purposes and shows that we indeed compute feasible solutions to the original problem (5.2). The required amount of solves for the state and adjoint system for the numerical solution of (5.2) is shown in Table 5. Since we now solve a sequence of optimization problems, the required number of PDE solves obviously increases compared to the problems investigated in Sect. 5.1. We also observe that we need significantly more solves for the state system than for the adjoint system which indicates that the step size computation via the Armijo rule takes more effort. On average, we need about two to three times as many solves for the state system compared to the number of solves for the adjoint system. As before (cf. Table 3), we observe that we need roughly the same amount of iterations to solve problems with the piecewise constant temperature models, and that we need considerably more iterations to solve the problems for the distributed model. Finally, we note that the number of PDE solves increases with increasing \(\chi ^{\text {des}}\) for almost all cases, which indicates that the problems become increasingly harder to solve for higher values of \(\chi ^{\text {des}}\).

Table 5 Number of solves for state/adjoint system for problem (5.2)
Fig. 7
figure 7

Optimized temperature profiles for problem (5.2) with \(\chi ^{\text {des}}= 0.9\) and \(\chi ^{\text {des}}= 0.95\)

Let us first investigate the qualitative behavior of the temperature profiles and flow rates, which are depicted in Fig. 7 for a desired conversion of \(\chi ^{\text {des}}= 0.9\) and \(\chi ^{\text {des}}= 0.95\). We observe that the optimized temperature profiles look very similar to the ones investigated in the previous section (cf. Fig. 6). Again, the temperature is higher at the beginning of the reactor and then decreases monotonically along its length for all cases. As expected after our previous investigation, the optimized flow rate increases considerably when using more complex temperature profiles. Moreover, we observe that the largest increase in flow rate happens between the constant and two stage temperature models, whereas increasing the model complexity further has only a smaller influence on the obtained flow rate. For all models, the optimized temperature profiles and flow rates are lower for a higher desired conversion \(\chi ^{\text {des}}\). This can be explained through the thermodynamically favorable conditions at lower temperatures and the higher residence times induced by lower flow rates which both lead to an increase in \(\hbox {CO}_{2}\) conversion (cf. Sect. 2.3).

Table 6 Numerical results for the optimization problem (5.2)

The optimized flow rates and absolute product yields for all choices of \(\chi ^{\text {des}}\) are listed in Table 6. Table 6a confirms our previous findings, as we see that the optimized flow rate increases with increasing complexity of the temperature models in all cases. As before, we also observe that the biggest difference is between the constant and the two stage model, where the flow rate increases by about 50 mL/min, and additional improvements from the three-stage and distributed model are smaller, with about 10 mL/min each. Furthermore, we observe that the optimized flow rate decreases rapidly when the \(\hbox {CO}_{2}\) conversion is increased. This indicates that higher \(\hbox {CO}_{2}\) conversions are considerably harder to achieve than lower ones. Considering the total molar product yield, which is depicted in Table 6b, we obviously have a similar behavior regarding the complexity of the temperature models as for the flow rate, since the \(\hbox {CO}_{2}\) conversion is restricted by the state constraint for all models. Finally, concerning the influence of the state constraint, it is interesting to observe that the molar product yield decreases with an increase of the desired conversion \(\chi ^{\text {des}}\). In particular, the higher achieved level of \(\hbox {CO}_{2}\) conversion cannot compensate the necessary reduction in the flow rate, so that overall less product is formed. This implies that one should consider optimizing the flow rate for a desired conversion \(\chi ^{\text {des}}\) as closely as possible to the actual desired product quality since using higher values of \(\chi ^{\text {des}}\) than necessary leads to lower product yields.

5.3 Realizability of our results in practice

Usually, chemical reactors are equipped with heat exchanger or cooling systems to control the temperature in the reactor. Regarding the realization of the optimized temperature profiles shown in Figs. 6 and 7 with such systems, we have the following remarks. For a microchannel reactor, an obvious heat exchanger system is one which is based on microchannels, too, as they have large heat transfer coefficients due to their high specific surface area. Moreover, they already have the appropriate dimensions to be coupled to a microchannel reactor. We refer the reader to, e.g., [62, 63] for an overview of such temperature control systems, as well as to our previous work [19, 20], where we considered the shape optimization of such systems. Naturally, for these systems, the question arises, how to align the flow of the coolant with the flow of the chemically reacting gas mixture. Popular choices are given by setups that use co-current, counter-current, or cross-flow configurations [64,65,66].

Let us start by investigating the counter-current configuration. In this setup, the coolant enters the domain around the area where the gas leaves it. It is at its lowest temperature there since it did not absorb a large amount of heat yet. As the coolant traverses its channel, it is constantly heated up by the reaction and becomes warmest near the reactor inlet. Looking at this from the view of the reactor channel, the gas temperature is at its highest point at the inlet and then gets cooled down progressively as it passes through the reactor. From our results (cf. Figs. 6 and 7), we can see that this behavior is desirable, so that a suitably chosen counter-current flow configuration could potentially approximate the temperature profiles from the distributed model. These deliberations also directly imply that the co-current flow configuration should be avoided if possible as this would cool the gas right at the inlet and heat it up as it passes through the reactor, achieving the opposite of our goal.

Finally, the cross-flow configuration could also be used to achieve the desired temperatures in the reactor: by using different coolant temperatures in the channels, corresponding to different sections of the reactor, one can achieve high temperatures around the reactor inlet and low temperatures around its outlet. In particular, if a sufficient amount of cooling channels are used, temperature profiles similar to those obtained by the distributed model could potentially be achieved. Moreover, also the piecewise constant temperature profiles could be, approximately, realizable using a cross-flow configuration by dividing the cooling channels into distinct groups, each with the same coolant temperature. In particular, these deliberations show that our choice of the temperature models for the considered optimization problems is sensible as they can be approximately realized in practice by means of appropriate temperature control systems.

In conclusion, we have seen that optimizing a microchannel reactor with methods from PDE-constrained optimization shows clear potential for considerable improvements of the reactor’s design. In particular, all of our results indicate that optimizing the temperature profile and flow rate of the reactor can increase the obtained product yields considerably. As the considered temperature models are realizable in practice, our results can be readily transferred to the actual application in order to optimize its design. However, we remark that, for this transfer, one also has to consider additional factors, such as energy and cost efficiency, for the particular application at hand.

6 Conclusion and outlook

In this work, we have introduced two mathematical models for chemically reacting fluid flow. First, a three-dimensional model that resolves all quantities over the channel cross section, and second, a one-dimensional model we derived using a homogenization of the reactor over its cross section. We used these to model the Sabatier reaction in a microchannel reactor. The necessary kinetic parameters to compute the reaction rate for these models were determined by means of solving a parameter identification problem constrained by the one-dimensional model using our software package cashocs [30] which implements and automates the adjoint approach for solving PDE-constrained optimization problems. The obtained results show excellent agreement with the corresponding experimental data published in [11, 29]. We compared both models numerically and found that the one-dimensional model is a very good approximation to the three-dimensional one, with relative errors well below 1%. Finally, we investigated two types of optimal control problems aimed at improving the quality of the reactor. First, we considered the wall temperature of the reactor as optimization variable and maximized the \(\hbox {CO}_{2}\) conversion for a fixed flow rate. Second, we considered the inlet flow velocity as additional control variable and maximized the mass flow rate of the reactor subject to a state constraint for the \(\hbox {CO}_{2}\) conversion that ensures a desired quality of the product. The results we obtained for both problems show great potential for optimizing the design of the reactor.

For future work, we propose the extension of our reactor model to include, e.g., more sophisticated reaction mechanisms for the Sabatier process or the RWGS reaction, which would yield a more accurate model. Additionally, it could be worthwhile to investigate other aspects of the reactor, e.g., by considering not only a single channel, but the entire reactor, or by coupling an appropriate temperature control system to the reactor, and to investigate similar optimization problems to the ones in this work in these settings. Another interesting possibility for future research would be to transfer our results into practice and to test them in reality. Finally, it is also of interest to transfer our setting to other reactions or more complex reactor settings, where, e.g., multiple reactions occur consecutively or in parallel, and to investigate their optimization using similar methods to the ones considered in this paper.