1 Introduction and notation

Flow in a two-phase porous medium is governed by the Darcy law [7]

$$\begin{aligned} q=-K(S) \left( \nabla p +\rho g e_d \right) , \end{aligned}$$
(1.1)

where \(e_d=(0,\ldots ,0,1)\) is the direction of gravity. The quantity S is the saturation, p is the pressure, and (the vector) q is the flow velocity of the wetting phase (usually water, while the non-wetting phase is oil or a gas).

The Darcy law represents conservation of momentum and, in order to close the system, we also need the conservation of mass

$$\begin{aligned} \partial _t S+ \mathrm {div}q =0. \end{aligned}$$
(1.2)

In the two-dimensional situation, we have three equations, given by (1.1) and (1.2), while we have four unknowns (two velocity components, saturation, and pressure). Therefore, one usually assumes a constitutive relation between the pressure p, the capillary pressure \(P_c\) (equal to the difference of the pressures of the non-wetting and wetting phases), and the saturation S. If it is assumed that the capillary pressure is (almost) constant, one can derive the Buckley–Leverett equation (a scalar conservation law derived in [8]):

$$\begin{aligned} \partial _t S+\partial _x f(S)=0, \end{aligned}$$
(1.3)

for \(f(S)=\frac{S^2}{S^2+A(1-S)^2}\) and an appropriate constant A.
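For orientation, the shape of this flux is easy to inspect numerically. The following sketch (with the illustrative choice \(A=1\)) confirms that f increases from 0 to 1 and is S-shaped, i.e., convex near \(S=0\) and concave near \(S=1\), hence neither convex nor concave on [0, 1]:

```python
import numpy as np

def bl_flux(S, A=1.0):
    """Buckley-Leverett flux f(S) = S^2 / (S^2 + A(1-S)^2); A = 1 is illustrative."""
    return S**2 / (S**2 + A * (1.0 - S)**2)

S = np.linspace(0.0, 1.0, 1001)
f = bl_flux(S)                            # increases from f(0) = 0 to f(1) = 1
fpp = np.gradient(np.gradient(f, S), S)   # numerical second derivative
# S-shape: convex near S = 0 (fpp > 0), concave near S = 1 (fpp < 0)
print(fpp[100] > 0, fpp[900] < 0)         # True True
```

This non-convexity is what makes the Riemann problem below produce composite waves rather than a single shock.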

If we assume that \(P_c\) is “static” (independent of the t-derivative of S) in the sense that \(P_c=p_c(S)\), then we arrive at a parabolic perturbation of the Buckley–Leverett equation [7]. However, both models (the standard Buckley–Leverett equation and the one perturbed by a parabolic term) appear to give results inconsistent with certain fairly simple experiments [14]. Namely, if we take a thin tube filled with dry sand, inject water at a constant rate at one end of the tube, and then measure the water concentration as a function of time, for certain injection rates the concentration will not be monotonic. This phenomenon, called overshoot, was noticed long ago and is closely related to the fingering effect [25]. Namely, the tips of the fingers that appear when water penetrates into dry soil have larger water concentrations than the body of the finger and, interestingly, the tips of the fingers almost do not change their shape or size. On the other hand, an entropy solution to the Buckley–Leverett equation (in the sense of Kruzhkov [33] or, more precisely, Oleinik [40], since we are dealing with a Riemann problem) or to its parabolic perturbation with the Riemann initial data

$$\begin{aligned} S(0,x)={\left\{ \begin{array}{ll} S_L, &{} x<0\\ 0, &{} x>0 \end{array}\right. } \end{aligned}$$
(1.4)

is monotonic (see, e.g., the introduction of [53]). Therefore, many attempts to explain the gap between the standard theory (provided by the Buckley–Leverett or Richards [45] equations) and the experimental observations have been put forth recently. A purely mathematical approach can be found in [27], where \(\delta \)-type solutions to (1.3) are constructed and the \(\delta \)-distributions appearing as part of the solution are explained as an inadequacy of the model. In [13], a model is suggested that involves the notion of non-local energy, which eventually leads to a fourth-order equation whose solution (with the Riemann initial data given above) has a shape corresponding to the experimental results from [14].

In [53] one can find an interesting approach explaining the discrepancy between theory and experiment described above, based on the dynamic capillarity concept introduced in [22, 23], which has drawn a lot of interest (especially after the publication of [53]; see also [1]). Namely, in [22, 23] it was supposed that the capillary pressure depends not only on the saturation S, but also on the time derivative of the saturation (of the wetting phase):

$$\begin{aligned} P_c=p_c(S)-\phi \tau \frac{\partial S}{\partial t}. \end{aligned}$$

Taking this into account and proceeding along the lines of the derivation of the Buckley–Leverett equation, one reaches a nonlinear pseudo-parabolic equation ([53, (1.16)]) which, after linearization of the higher-order terms, reduces to

$$\begin{aligned} \partial _t S+\partial _x f(S)=\varepsilon \partial _{xx}S+\tau \delta \partial _{xx} \partial _t S, \end{aligned}$$
(1.5)

where \(\varepsilon \) and \(\delta \) (\(\delta =\varepsilon ^2\) in [53]) are small parameters, while \(\tau \) is a fixed constant. By analyzing possible traveling wave solutions \(S(\frac{x-ct}{\varepsilon })\), \(S(-\infty )=S_L\), \(S(+\infty )=S_R\), where the constant c is given by the Rankine–Hugoniot condition \(c=\frac{f(S_L)-f(S_R)}{S_L-S_R}\), the authors arrive at solutions constructed from elementary waves (i.e., shock waves and rarefaction waves) to (1.3), (1.4) that are non-standard (i.e., non-admissible in the sense of Oleinik). Moreover, for some values of \(S_L\), such a solution exhibits an overshoot-type phenomenon (see [53, Figure 7]).
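The non-classical character of such shocks can be checked directly: for the Riemann data (1.4), the single shock from \(S_L\) down to 0 violates the Oleinik chord condition. The sketch below does this for the illustrative values \(A=1\), \(S_L=1\):

```python
import numpy as np

def f(S, A=1.0):
    # Buckley-Leverett flux with the illustrative choice A = 1.
    return S**2 / (S**2 + A * (1.0 - S)**2)

S_L, S_R = 1.0, 0.0                    # illustrative Riemann data as in (1.4)
c = (f(S_L) - f(S_R)) / (S_L - S_R)    # Rankine-Hugoniot speed; c = 1 here

# Oleinik admissibility of a shock with S_L > S_R requires the graph of f
# to lie below the chord joining (S_R, f(S_R)) and (S_L, f(S_L)) in between:
S = np.linspace(S_R, S_L, 1001)[1:-1]
chord = f(S_R) + c * (S - S_R)
print(np.all(f(S) <= chord + 1e-12))   # False: the single shock is inadmissible
```

The chord condition fails near \(S=1\) (e.g., \(f(0.9)>0.9\)), which is why the entropy solution replaces the single shock by a monotone composite wave, while the solutions of [53] retain the inadmissible shock.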

In this paper, we shall take a step forward in the sense that we shall consider a multi-dimensional generalization of (1.5) with a flux \({{\mathfrak {f}}}={{\mathfrak {f}}}(\mathbf{x},\lambda )\) depending explicitly on the space variable, and we shall analyze the corresponding dynamic capillarity limit as \(\varepsilon \rightarrow 0\) for arbitrary initial data.

The \(\mathbf{x}\)-dependence of the flux means that we assume that the medium in which we consider the phenomenon is heterogeneous, i.e., that it has different properties at different points (e.g., because some parts of the medium consist of sand while others consist of clay). Moreover, we shall assume that the flux is discontinuous with respect to the space variable \(\mathbf{x}\), which means that the medium experiences abrupt (discontinuous) changes in its properties (one can imagine highly permeable sand adjacent to a weakly permeable clay layer; the permeability is then discontinuous). Let us remark here that, motivated by various applications, evolution equations with discontinuous flux have attracted considerable attention recently. For a (non-exhaustive) selection of recent results, cf. [2, 3, 12, 29, 43] and the references therein.

Explicitly, we consider the conservation law

$$\begin{aligned} \partial _t u+\mathrm {div}{{\mathfrak {f}}}(\mathbf{x},u)=0, \end{aligned}$$
(1.6)

where

  1. (C1)

    \({{\mathfrak {f}}}= (f_i)_{i=1}^d :(\mathbf{x},\lambda )\mapsto {{\mathfrak {f}}}(\mathbf{x},\lambda )\), \(f_i \in L^1({\mathbb {R}}^d\times {\mathbb {R}}) \cap L^p({\mathbb {R}}^d\times {\mathbb {R}})\) and \(\partial _{\lambda } f_i \in L^p({\mathbb {R}}^d\times {\mathbb {R}})\) for some \(1<p<\infty \) and all \(i=1,\ldots , d\);

  2. (C2)

    \(\mathrm {div}_{\mathbf{x}} {{\mathfrak {f}}}(\mathbf{x},\lambda ) \in {{{\mathcal {M}}}}({\mathbb {R}}^d\times {\mathbb {R}})\), where \({{{\mathcal {M}}}}({\mathbb {R}}^d\times {\mathbb {R}})\) is the space of Radon measures, and there exist a constant \(\beta >0\) and a finite Radon measure \(\mu \in {{{\mathcal {M}}}}({\mathbb {R}}^d)\) such that (in the sense of measures)

    $$\begin{aligned} |\mathrm {div}_{\mathbf{x}} {{\mathfrak {f}}}(\mathbf{x},\lambda )|\le \frac{\mu (\mathbf{x})}{1+|\lambda |^{1+\beta }}; \end{aligned}$$
  3. (C3)

    \(\Vert \sup \limits _{\lambda \in {\mathbb {R}}} |\partial _{\lambda }f_i(\cdot ,\lambda )| \Vert _{L^1({\mathbb {R}}^d)} <\infty \) for all \(i=1,\ldots , d\).

We then perturb (1.6) by (linearized) vanishing diffusion and dynamic capillarity terms:

$$\begin{aligned} \partial _t u_{\varepsilon }+\mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},u_\varepsilon )=\varepsilon \Delta u_\varepsilon +\delta (\varepsilon )\Delta \partial _t u_\varepsilon . \end{aligned}$$
(1.7)

Here, \({{\mathfrak {f}}}_\varepsilon := K_\varepsilon \cdot {{\mathfrak {f}}}\star \omega _{n(\varepsilon )}\), where the componentwise convolution \({{\mathfrak {f}}}\star \omega _{n(\varepsilon )}\) is a regularization of \({{\mathfrak {f}}}\) and \(K_\varepsilon =K_\varepsilon (\mathbf{x},\lambda )\) is a bounded family of compactly supported functions equal to one on the ball \(B(0,1/\varepsilon )\subset {\mathbb {R}}^{d+1}\) and such that \(\Vert \nabla K_\varepsilon \Vert _{L^\infty ({\mathbb {R}}^{d+1})} \le 1\). We remark here that (1.7) is reminiscent of the diffusion–dispersion regularization (see, e.g., [4, 31]), but its third-order term contains a t-derivative, which makes a significant difference between the two situations. We will comment on this in more detail later.

More precisely, we suppose that \(\omega \in C^\infty _c({\mathbb {R}}^{d+1})\) is of the form \(\omega (\mathbf{x},\lambda )=\omega ^{(1)}(\mathbf{x})\) \(\omega ^{(2)}(\lambda )\) with \(\omega ^{(1)}, \omega ^{(2)}\) test functions with unit integral on \({\mathbb {R}}^d\) and \({\mathbb {R}}\), respectively, and

$$\begin{aligned} \omega _{n(\varepsilon )}(\mathbf{x},\lambda ) \equiv \omega ^{(1)}_{n(\varepsilon )}(\mathbf{x})\omega ^{(2)}_{n(\varepsilon )}(\lambda ) = n(\varepsilon )^{-d} \omega ^{(1)}(\mathbf{x}/n(\varepsilon ))n(\varepsilon )^{-1} \omega ^{(2)}(\lambda /n(\varepsilon )) \end{aligned}$$
(1.8)

with an appropriate \(n(\varepsilon )\) such that \(n(\varepsilon )\rightarrow 0\) as \(\varepsilon \rightarrow 0\) (the form of \(n(\varepsilon )\) will be precisely determined later). Note that

$$\begin{aligned} {{\mathfrak {f}}}_\varepsilon \rightarrow {{\mathfrak {f}}}\text { strongly in } L^p({\mathbb {R}}^d\times {\mathbb {R}}) \text { as } \varepsilon \rightarrow 0. \end{aligned}$$

We supplement (1.6) with the initial data

$$\begin{aligned} u|_{t=0}=u_0(\mathbf{x})\in L^1({\mathbb {R}}^d) \cap L^\infty ({\mathbb {R}}^d), \end{aligned}$$
(1.9)

while we take

$$\begin{aligned} u_\varepsilon \big |_{t=0}=u_{0}^\varepsilon (\mathbf{x}) \rightarrow u_0(\mathbf{x}) \ \ \text {strongly in} \ \ L^p({\mathbb {R}}^d), \end{aligned}$$
(1.10)

where, for every \(i=1,\ldots ,d\),

$$\begin{aligned} \Vert u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})}+n(\varepsilon ) \Vert \nabla u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})}+n(\varepsilon )^{2}\Vert \nabla u^\varepsilon _{0x_{i}}\Vert _{L^{2}({\mathbb {R}}^{d})}\le C_{0}. \end{aligned}$$
(1.11)

It is not difficult to see that \(u_{0}^\varepsilon =u_0\star \omega ^{(1)}_{n(\varepsilon )}\), where \(\omega ^{(1)}_{n(\varepsilon )}\) is as above, satisfies (1.11). We shall, however, use (1.11) as an assumption.
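A quick 1D numerical illustration of this fact may be helpful: each derivative of \(u_0\star \omega ^{(1)}_{n(\varepsilon )}\) costs at most a factor \(1/n(\varepsilon )\), which the weights \(n(\varepsilon )\) and \(n(\varepsilon )^2\) in (1.11) absorb. The step profile and the cosine bump below are illustrative stand-ins for \(u_0\in L^1\cap L^\infty \) and the mollifier \(\omega ^{(1)}\):

```python
import numpy as np

# 1D numerical check of assumption (1.11) for mollified data u0 * omega_n.
dx = 1e-3
x = np.arange(-4.0, 4.0, dx)
u0 = ((x > -1.0) & (x < 1.0)).astype(float)   # illustrative step datum

def mollify(u, n):
    s = np.arange(-n, n + dx, dx)
    omega = (1.0 + np.cos(np.pi * s / n)) / (2.0 * n)   # unit integral
    return np.convolve(u, omega, mode="same") * dx

def l2(v):
    return np.sqrt(np.sum(v**2) * dx)

totals = []
for n in (0.4, 0.2, 0.1):
    un = mollify(u0, n)
    d1 = np.gradient(un, dx)   # L2 norm grows like n**-0.5 for step data
    d2 = np.gradient(d1, dx)   # L2 norm grows like n**-1.5
    totals.append(l2(un) + n * l2(d1) + n**2 * l2(d2))

# The weighted quantity from (1.11) stays bounded (here it even decreases)
# as the mollification scale n shrinks:
print([round(t, 2) for t in totals])
```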

With regard to pseudo-parabolic equations, we have already explained their importance in the porous media theory. Besides a possible justification of the experimental results from [14], there is a confirmed application to the seepage of homogeneous fluids through a fissured rock [5]. Also, they describe the unidirectional propagation of nonlinear, dispersive long waves [6, 51] (where u is typically the amplitude or velocity), or population dynamics [41] (where u represents the population density).

In [48, 52], the authors investigated the initial-boundary value problem and the Cauchy problem for linear pseudo-parabolic equations and established the existence and uniqueness of solutions. As for the nonlinear variants, one can find numerous results even for singular and degenerate pseudo-parabolic equations (see, e.g., [9, 28, 30, 34, 39, 44] and the references therein). Besides existence and uniqueness, the given references also establish properties of solutions such as asymptotic behavior and regularity. However, we are not aware of any corresponding results for the Cauchy problem for an equation of type (1.5).

This problem will be considered in Sect. 2. We shall use an approach characteristic of the theory of wave equations [47]: we shall apply the Fourier transform with respect to \(\mathbf{x}\), solve the resulting ordinary differential equation with respect to t, and prove the following theorem (the constants \(\varepsilon \) and \(\delta \) are omitted for simplicity):

Theorem 1.1

There exists a unique solution to

$$\begin{aligned} u_{t}+\mathrm {div}({{\mathfrak {f}}}(\mathbf{x}, u)) =\triangle u+\partial _{t}\triangle u, \ \ (t,x)\in [0,T)\times {\mathbb {R}}^d, \end{aligned}$$
(1.12)

where \(T>0\) and \({{\mathfrak {f}}}\in C^\infty _c({\mathbb {R}}^d\times {\mathbb {R}})\), supplemented with (merely) the initial condition

$$\begin{aligned} u|_{t=0}=u_0(\mathbf{x}) \in C_c^\infty ({\mathbb {R}}^d), \end{aligned}$$
(1.13)

which belongs to \(L^2((0,T)\times {\mathbb {R}}^d)\cap C^\infty ((0,T)\times {\mathbb {R}}^d)\).

In the next step (Sect. 3), we are going to let the perturbation parameter \(\varepsilon \rightarrow 0\) in (1.7). This type of problem for conservation laws (when there is not only vanishing viscosity, but also a third- or higher-order perturbation) was first addressed in [46] and has received considerable attention since. A thorough analysis of this kind of limit in the one-dimensional situation with a regular flux can be found in [38], where the theory of non-classical shocks for conservation laws was essentially initiated. The standard approach here is to rewrite the equation under consideration in the kinetic formulation or using Young measures (which is essentially equivalent), and then to apply velocity averaging results [4, 26], compensated compactness (in one-dimensional situations) [10, 11, 46], or DiPerna's techniques involving Young measures [15], as done in [31] and many other works. Again, we note that this list of citations is far from complete. The problem in our case is the low regularity of the flux function \({{\mathfrak {f}}}\) (it can be discontinuous with respect to \(\mathbf{x}\in {\mathbb {R}}^d\)) as well as the multidimensional character of our problem, which prevents us from using the Young-measure and compensated-compactness approaches here (see item (iii) below and Remark 3.9 for the case of a regular flux). We also remark that, to the best of our knowledge, the only diffusion–dispersion type result for equations with discontinuous coefficients is [24].

Moreover, unlike the situation that is typical in the case of the diffusion–dispersion limit ([4, 31, 46], etc.), where the existence of a solution to the perturbed equation is assumed together with all necessary properties of the solution, we prove in Sect. 2 the existence of the functional sequence whose convergence we analyze.

As for the velocity averaging theory, most of the results are given in the case of a homogeneous flux [16, 21, 42, 50] or a flux for which \(p \ge 2\) [19, 35]. We have recently proved [36] the velocity averaging lemma in the case \(p>1\) and this will enable us to prove the strong convergence of the sequence \((u_\varepsilon )\) of solutions to (1.7), (1.10) along a subsequence toward a weak solution to (1.6). The following statements are the main results of the paper:

  1. (i)

    If \(\delta =o(\varepsilon ^2)\) and

    $$\begin{aligned} \frac{\sqrt{\delta }}{\varepsilon n(\varepsilon )^{d+2}} \rightarrow 0 \end{aligned}$$
    (1.14)

    as \(\varepsilon \rightarrow 0\), then the family \((u_\varepsilon )\) contains a strongly converging subsequence in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\) under a suitable non-degeneracy condition (see Definition 3.2);

  2. (ii)

    If \(\delta ={{{\mathcal {O}}}}(\varepsilon ^2)\) and \(\frac{\sqrt{\delta }}{\sqrt{\varepsilon } n(\varepsilon )^{d/2+2}}\rightarrow 0\) as \(\varepsilon \rightarrow 0\), and the \(\lambda \)-derivative of the flux \(\partial _\lambda {{\mathfrak {f}}}\) is bounded in addition to condition (C3), then the family \((u_\varepsilon )\) is strongly precompact in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\) under the non-degeneracy condition;

  3. (iii)

    If \(\delta =o(\varepsilon ^2)\) and \(\frac{\sqrt{\delta }}{\sqrt{\varepsilon } n(\varepsilon )^{d/2+2}}\rightarrow 0\) as \(\varepsilon \rightarrow 0\), and if the flux \({{\mathfrak {f}}}\in C^1({\mathbb {R}}^d\times {\mathbb {R}})\), then the family \((u_\varepsilon )\) converges strongly in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\) toward the entropy solution to (1.6), (1.9) (no non-degeneracy condition is needed).

We conclude the introduction by noting that the subsequence of \((u_\varepsilon )\) given in (i) does not necessarily converge toward an entropy admissible solution to the underlying conservation law, since its flux is not smooth and for such fluxes the well-posedness theory has not yet been developed in full generality (compare with item (iii) above).

Also, in case (ii), even when the flux is smooth, we do not necessarily have convergence toward the Kruzhkov entropy solution (see [53] for stepwise initial data). On the other hand, the approximating procedure actually models a physical situation in which diffusion and capillarity have equal effect on the process and, as any physical phenomenon in the macro-world is well-posed, it is expected to generate a stable semigroup of solutions to the underlying conservation law. We will deal with this question in future research.

The paper is organized as follows: In Sect. 2, we prove existence and uniqueness for the pseudo-parabolic problem (1.12), (1.13). In Sect. 3, we derive the necessary estimates from (1.7), (1.10) and use them to prove strong convergence toward a weak solution to the underlying conservation law.

2 Existence and uniqueness of the solution to the pseudo-parabolic equation (1.12) with (1.13)

Throughout this section, we suppose that \({{\mathfrak {f}}}\) is smooth and compactly supported. Under this assumption, we want to show that (1.12) with the initial condition (1.13) has a unique solution. The strategy for solving this problem is to define a mapping \({{{\mathcal {T}}}}: L^2([0,T)\times {\mathbb {R}}^d) \rightarrow L^2([0,T)\times {\mathbb {R}}^d)\) such that for every \(v\in L^2([0,T)\times {\mathbb {R}}^d)\) the function

$$\begin{aligned} u={{{\mathcal {T}}}}(v) \end{aligned}$$
(2.1)

represents a solution to

$$\begin{aligned} u_{t}+\mathrm {div}({{\mathfrak {f}}}(\mathbf{x}, v)) =\triangle u+\partial _{t}\triangle u, \ \ (t,x)\in [0,T)\times {\mathbb {R}}^d, \end{aligned}$$
(2.2)

with the initial condition (1.13). Generally, for \(u_0\in L^2({\mathbb {R}}^d)\), we seek weak solutions u of this initial value problem in the sense that, for any smooth test function \(\varphi \) with compact support in \([0, T)\times {\mathbb {R}}^d\), we require

$$\begin{aligned} \int \limits _0^T\int \limits _{{\mathbb {R}}^d} u\varphi _t + {{\mathfrak {f}}}(\mathbf{x},v)\cdot \nabla \varphi + u\Delta \varphi - u\partial _t \Delta \varphi \,\mathrm{d}x\mathrm{d}t + \int \limits _{{\mathbb {R}}^d} u_0(\varphi (0,\mathbf{x})-\Delta \varphi (0,\mathbf{x}))\,\mathrm{d}x=0. \end{aligned}$$

Then, we shall prove that the mapping \({{{\mathcal {T}}}}\) possesses a fixed point, which will turn out to be the solution to (1.12), (1.13).

To this end, we need the following consequence of the Leray–Schauder fixed point theorem (cf. [20, Th. 11.3]):

Theorem 2.1

Let \({{{\mathcal {T}}}}\) be a compact mapping of a Banach space \({\mathcal {B}}\) into itself and suppose that there exists a constant C such that

$$\begin{aligned} \Vert u\Vert _{{{{\mathcal {B}}}}}\le C \end{aligned}$$
(2.3)

for all \(u\in {{{\mathcal {B}}}}\) and \(\sigma \in [0,1]\) satisfying \(u=\sigma {{{\mathcal {T}}}}u\). Then, \({{{\mathcal {T}}}}\) has a fixed point, that is, \({{{\mathcal {T}}}}u=u\) for some \(u\in {\mathcal {B}}\).

Finally, let us fix our conventions for the Fourier transform and its inverse. For \(u\in L^1({\mathbb {R}}^d)\cap L^2({\mathbb {R}}^d)\), the Fourier transform of u is

$$\begin{aligned} {{{\mathcal {F}}}}(u)(\varvec{\xi })={\hat{u}}(\varvec{\xi })=\int \limits _{{\mathbb {R}}^d} e^{- i \mathbf{x}\cdot \varvec{\xi }} u(\mathbf{x})\, \mathrm{d}\mathbf{x}\end{aligned}$$

and the inverse Fourier transform is given by

$$\begin{aligned} {{{\mathcal {F}}}}^{-1}(u)(\mathbf{x})={\check{u}}(x)= \frac{1}{(2\pi )^d}\int \limits _{{\mathbb {R}}^d} e^{i \mathbf{x}\cdot \varvec{\xi }} u(\varvec{\xi })\, \mathrm{d}\varvec{\xi }, \end{aligned}$$

where \(\cdot \) denotes the scalar product and i is the imaginary unit.
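With this normalization, Plancherel's identity carries the factor \((2\pi )^{-d}\), i.e., \(\int |u|^2\,\mathrm{d}\mathbf{x}=(2\pi )^{-d}\int |{\hat{u}}|^2\,\mathrm{d}\varvec{\xi }\) (a harmless constant in the estimates below). A quick 1D quadrature check with the Gaussian \(u(x)=e^{-x^2/2}\), whose transform under this convention is \({\hat{u}}(\xi )=\sqrt{2\pi }\,e^{-\xi ^2/2}\):

```python
import numpy as np

# Verify the convention F(u)(xi) = int exp(-i x xi) u(x) dx on a Gaussian,
# and the (2*pi)^{-1} factor in Plancherel's identity (d = 1).
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
u = np.exp(-x**2 / 2)

# Direct quadrature of the transform at a few sample frequencies:
xis = np.linspace(-3.0, 3.0, 7)
u_hat = np.array([np.sum(np.exp(-1j * xi * x) * u) * dx for xi in xis])
err = np.max(np.abs(u_hat - np.sqrt(2 * np.pi) * np.exp(-xis**2 / 2)))

# Both sides of Plancherel equal sqrt(pi) for this u (xi-grid = x-grid):
lhs = np.sum(u**2) * dx
rhs = np.sum(np.abs(np.sqrt(2 * np.pi) * np.exp(-x**2 / 2))**2) * dx / (2 * np.pi)
print(err < 1e-10, abs(lhs - rhs) < 1e-10)   # True True
```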

We begin by proving existence and uniqueness of solutions to (2.2), (1.13).

Lemma 2.2

Let \(v\in L^2([0,T]\times {\mathbb {R}}^d)\). Then, for any \(u_0\in L^2({\mathbb {R}}^d)\) there exists a unique solution \(u\in L^2([0,T]\times {\mathbb {R}}^d)\) to (2.2) with \(u|_{t=0}=u_0\).

Proof

After applying the Fourier transform with respect to \(\mathbf{x}\in {\mathbb {R}}^d\) to (2.2), we obtain

$$\begin{aligned} \partial _t {\hat{u}}(t,\varvec{\xi })+i \varvec{\xi }\cdot \widehat{{{\mathfrak {f}}}(\cdot ,v(t,\cdot ))}(\varvec{\xi })=-|\varvec{\xi }|^2 {\hat{u}}(t,\varvec{\xi }) - |\varvec{\xi }|^2 \partial _t {\hat{u}}(t,\varvec{\xi }), \end{aligned}$$

where \(|\varvec{\xi }|=(\xi _1^2+\cdots +\xi _d^2)^{\frac{1}{2}}\). We rewrite the last equation in the form

$$\begin{aligned} \frac{\mathrm{d} {\hat{u}}(t,\varvec{\xi })}{\mathrm{d}t} +\frac{|\varvec{\xi }|^2}{{1+|\varvec{\xi }|^2}}{\hat{u}}(t,\varvec{\xi }) =-\frac{i \varvec{\xi }\cdot \widehat{{{\mathfrak {f}}}(\cdot ,v(t,\cdot ))}(\varvec{\xi })}{1+|\varvec{\xi }|^2}. \end{aligned}$$

It follows from this that any weak solution to (2.2) has a t-derivative in \(L^2\) as well; hence, in particular, it is continuous with respect to t and therefore possesses a classical trace on \(t=0\). Solving the ODE with the initial data \({\hat{u}}(0,\varvec{\xi })={\hat{u}}_0(\varvec{\xi })\), we arrive at

$$\begin{aligned} {\hat{u}}(t,\varvec{\xi })=e^{-\frac{|\varvec{\xi }|^2 t}{1+|\varvec{\xi }|^2}} \left( {\hat{u}}_0(\varvec{\xi })-\int \limits _0^t\frac{i \varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}}{1+|\varvec{\xi }|^2} e^{\frac{|\varvec{\xi }|^2 t'}{1+|\varvec{\xi }|^2}}\,\mathrm{d}t'\right) , \end{aligned}$$
(2.4)

where \({\hat{{{\mathfrak {f}}}}}=\widehat{{{\mathfrak {f}}}(\cdot ,v(t',\cdot ))}(\varvec{\xi })\). Taking the inverse Fourier transform with respect to \(\varvec{\xi }\in {\mathbb {R}}^d\), we indeed obtain a weak solution; combined with the above considerations, both existence and uniqueness follow. \(\square \)
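The representation (2.4) is also convenient numerically. In the linear case \({{\mathfrak {f}}}\equiv 0\), it reduces to the Fourier multiplier \(e^{-|\varvec{\xi }|^2 t/(1+|\varvec{\xi }|^2)}\), and a periodic FFT sketch (grid size and wavenumber below are illustrative choices) confirms that a single mode \(\cos (kx)\) decays by the factor \(e^{-k^2t/(1+k^2)}\), strictly slower than the heat-equation factor \(e^{-k^2t}\):

```python
import numpy as np

# Periodic FFT sketch of (2.4) with f = 0 (d = 1): the solution operator is
# the multiplier exp(-|xi|^2 t / (1 + |xi|^2)) in Fourier space.
N, L = 256, 2 * np.pi
x = np.linspace(0.0, L, N, endpoint=False)
xi = 2 * np.pi * np.fft.fftfreq(N, d=L / N)   # integer wavenumbers here

def linear_solution(u0, t):
    """u(t) = F^{-1}( exp(-|xi|^2 t / (1+|xi|^2)) F(u0) ), cf. (2.4) with f = 0."""
    mult = np.exp(-xi**2 * t / (1.0 + xi**2))
    return np.real(np.fft.ifft(mult * np.fft.fft(u0)))

k, t = 3, 1.0
u = linear_solution(np.cos(k * x), t)
exact = np.exp(-k**2 * t / (1 + k**2)) * np.cos(k * x)
print(np.max(np.abs(u - exact)))   # ~ machine precision
```

Note that the multiplier is bounded below by \(e^{-t}\), so the solution operator does not smooth the data; this is consistent with the fact that the regularity in Theorem 2.4 comes from the smoothness of \(u_0\) and \({{\mathfrak {f}}}\).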

Theorem 2.3

For any function \(v\in L^2([0,T]\times {\mathbb {R}}^d)\), the solution to (2.2), (1.13) belongs to \(H^1((0,T)\times {\mathbb {R}}^d)\).

Proof

Let u be the solution to (2.2), (1.13) constructed in Lemma 2.2. We need to estimate the weak derivatives \(\partial _{x_j} u\), \(j=1,\ldots ,d\), and \(\partial _t u\). According to the Plancherel theorem (below, we take the Fourier transform with respect to \(\mathbf{x}\in {\mathbb {R}}^d\)) and the Cauchy–Schwarz inequality, we have

$$\begin{aligned} \begin{aligned} \Vert \partial _{x_j} u\Vert _{L^2([0,T]\times {\mathbb {R}}^d)}&= \Vert i \xi _j {\hat{u}}\Vert _{L^2([0,T]\times {\mathbb {R}}^d)} \\&\le \Vert \xi _j {\hat{u}}_0 \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}+\left\| \int \limits _0^t\frac{i \xi _j \varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}}{1+|\varvec{\xi }|^2}e^{\frac{|\varvec{\xi }|^2 t'}{1+|\varvec{\xi }|^2}}\mathrm{d}t'\right\| _{L^2([0,T]\times {\mathbb {R}}^d)}\\&\le \Vert \xi _j {\hat{u}}_0 \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}+\left( \int \limits _0^T \int \limits _{{\mathbb {R}}^d} \left| \int \limits _0^t \frac{i \xi _j \varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}}{1+|\varvec{\xi }|^2}e^{\frac{|\varvec{\xi }|^2 t'}{1+|\varvec{\xi }|^2}}\mathrm{d}t' \right| ^2 \mathrm{d}\varvec{\xi }\mathrm{d}t\right) ^{1/2}\\&\le \Vert \xi _j {\hat{u}}_0 \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}+\left( \int \limits _0^T \int \limits _{{\mathbb {R}}^d} t \int \limits _0^t \frac{\xi ^2_j |\varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}|^2}{(1+|\varvec{\xi }|^2)^2}e^{\frac{2|\varvec{\xi }|^2 t'}{1+|\varvec{\xi }|^2}}\mathrm{d}t' \mathrm{d}\varvec{\xi }\mathrm{d}t\right) ^{1/2}\\&\le \Vert \xi _j {\hat{u}}_0 \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}+T e^{T} \left( \int \limits _0^T\int \limits _{{\mathbb {R}}^d} \frac{\xi _j^2 \, |\varvec{\xi }|^2 \, |{\hat{{{\mathfrak {f}}}}}|^2}{(1+|\varvec{\xi }|^2)^2} \mathrm{d}\varvec{\xi }\mathrm{d}t\right) ^{1/2}\\&\le \Vert \xi _j {\hat{u}}_0 \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}+T e^{T} \left( \int \limits _0^T\int \limits _{{\mathbb {R}}^d} |{{\mathfrak {f}}}(\mathbf{x},v(t,\mathbf{x}))|^2 \mathrm{d}\mathbf{x}\mathrm{d}t\right) ^{1/2}, \end{aligned} \end{aligned}$$
(2.5)

where in the last step we used the Plancherel theorem again. From here, since \({{\mathfrak {f}}}\) is compactly supported with respect to \(\mathbf{x}\in {\mathbb {R}}^d\) and bounded, we conclude that \(\partial _{x_j} u \in L^2([0,T]\times {\mathbb {R}}^d)\).

As for the t-derivative, we have

$$\begin{aligned} \partial _t {\hat{u}}= -\frac{|\varvec{\xi }|^2}{1+|\varvec{\xi }|^2} e^{-\frac{|\varvec{\xi }|^2 t}{1+|\varvec{\xi }|^2}} \left( {\hat{u}}_0(\varvec{\xi })-\int \limits _0^t\frac{i\varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}}{1+|\varvec{\xi }|^2} e^{\frac{|\varvec{\xi }|^2 t'}{1+|\varvec{\xi }|^2}}\,\mathrm{d}t' \right) -\frac{i\varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}}{1+|\varvec{\xi }|^2}, \end{aligned}$$
(2.6)

and therefore, repeating the procedure giving us the estimates on \(\partial _{x_j} u\), we get

$$\begin{aligned} \Vert \partial _t u \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}=\Vert \partial _t {\hat{u}} \Vert _{L^2([0,T]\times {\mathbb {R}}^d)} \le C \end{aligned}$$

for a constant \(C\in {\mathbb {R}}\). \(\square \)

Based on this, we can proceed to applying Theorem 2.1 to prove existence of a solution to (1.12), (1.13).

Theorem 2.4

There exists a unique solution to (1.12), (1.13). This solution belongs to \(L^2([0,T]\times {\mathbb {R}}^d)\cap C^\infty ([0,T]\times {\mathbb {R}}^d)\).

Proof

Let us first take a sequence of balls \(B_n=B(0,n)\) centered at 0 and of radius n. Set \(u_0^n(\mathbf{x})=u_0(\mathbf{x}) \chi _{B_n}(\mathbf{x})\), where \(\chi _{B_n}: {\mathbb {R}}^d\rightarrow [0,1]\) is a smooth regularization of the characteristic function of the set \(B_n\) such that \(\chi _{B_n}(\mathbf{x})=1\) for \(\mathbf{x}\in B(0,n-1)\) and \(\chi _{B_n}(\mathbf{x})=0\) for \(\mathbf{x}\notin B(0,n)\). Then, we define the mapping \({{{\mathcal {T}}}}_n: L^2([0,T]\times B_n) \rightarrow L^2([0,T]\times B_n)\) by

$$\begin{aligned} {{{\mathcal {T}}}}_n (v) :=\chi _{B_n} {{{\mathcal {T}}}} (v\chi _{B_n}), \end{aligned}$$

where \({{{\mathcal {T}}}}\) is the operator from (2.1) with the initial data \(u_0 \chi _{B_n}\), and we extend \(v\chi _{B_n}\) by zero so that it is defined on \([0,T]\times {\mathbb {R}}^d\). This map is continuous since \({{{\mathcal {T}}}}\) is: indeed, by the proof of Theorem 2.3, \({{{\mathcal {T}}}}\) is continuous as a map from \(L^2([0,T)\times {\mathbb {R}}^d)\) to \( H^1((0,T)\times {\mathbb {R}}^d)\). Thus, we can use Theorem 2.3 and the Rellich theorem (keeping in mind that \(B_n\) is a bounded set) to conclude that \({{{\mathcal {T}}}}_n\) is a compact mapping.

Let us check condition (2.3) from Theorem 2.1. If a function \(u\in L^2([0,T]\times B_n)\) satisfies \(u=\sigma {{{\mathcal {T}}}}_n u\), then the function \({\tilde{u}}:=\sigma {{{\mathcal {T}}}} (u \chi _{B_n})\) satisfies \(\chi _{B_n} {\tilde{u}}(t,\mathbf{x})=u(t,\mathbf{x})\) for \(\mathbf{x}\in B_{n}\) and is a solution of the Cauchy problem

$$\begin{aligned} \begin{aligned} {\tilde{u}}_{t}+\sigma \mathrm {div}({{\mathfrak {f}}}(\mathbf{x}, u \chi _{B_n}))&=\triangle {\tilde{u}}+\partial _{t}\triangle {\tilde{u}}, \ \ (t,x)\in [0,T)\times {\mathbb {R}}^d\\ {\tilde{u}}(0,\mathbf{x})&= \sigma u_0(\mathbf{x}) \chi _{B_n}(\mathbf{x}) \end{aligned} \end{aligned}$$

If we apply Fourier transform with respect to \(\mathbf{x}\in {\mathbb {R}}^d\) here and solve the ODE so obtained, we arrive at an expression analogous to (2.4):

$$\begin{aligned} \hat{{\tilde{u}}}(t,\varvec{\xi })=\sigma e^{-\frac{t|\varvec{\xi }|^2}{1+|\varvec{\xi }|^2}} \left( \widehat{u_0\chi _{B_n}}(\varvec{\xi })- \int \limits _0^t\frac{i \varvec{\xi }\cdot {\hat{{{\mathfrak {f}}}}}}{1+|\varvec{\xi }|^2} e^{\frac{t' |\varvec{\xi }|^2}{1+|\varvec{\xi }|^2}}\,\mathrm{d}t'\right) , \end{aligned}$$

where \({\hat{{{\mathfrak {f}}}}}={{{\mathcal {F}}}}\big ({{\mathfrak {f}}}(\mathbf{x},u(t,\mathbf{x})\chi _{B_n}(\mathbf{x}))\big )\) is the Fourier transform with respect to \(\mathbf{x}\). Now, \( \Vert u \Vert _{L^2([0,T]\times B_n)} \le \Vert {\tilde{u}} \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}\) and, repeating the procedure from (2.5) and (2.6), we conclude that (2.3) holds for the mapping \({{{\mathcal {T}}}}_n\) (keeping in mind that n is fixed).

Thus, for every \(n\in {\mathbb {N}}\) there exists a function \(u_n\in L^2([0,T]\times B_n)\) solving

$$\begin{aligned} \begin{aligned} \partial _t{u_n}+ \mathrm {div}({{\mathfrak {f}}}(\mathbf{x}, u_n))&=\triangle u_n+\partial _{t}\triangle u_n, \ \ (t,x)\in [0,T)\times B_{n-1}\\ u_n(0,\mathbf{x})&=u_0(\mathbf{x}), \ \ \mathbf{x}\in B_{n-1} \end{aligned} \end{aligned}$$

Extending \(u_n\) by zero outside of \(B_n\), we obtain a sequence \((u_n)\) that is bounded in \(L^2([0,T]\times {\mathbb {R}}^d)\) and locally bounded in \(H^1((0,T)\times {\mathbb {R}}^d)\). Thus, \((u_n)\) admits a subsequence that converges locally toward a function \(u\in L^2([0,T]\times {\mathbb {R}}^d)\). It is clear that the function u is a weak solution to (1.12), (1.13), and thus it must belong at least to \(H^1((0,T)\times {\mathbb {R}}^d)\).

Moreover, the proof of Theorem 2.3 with \(v=u\) is a classical example of a bootstrapping procedure, which enables us to conclude that u actually belongs to each \(H^k((0,T)\times {\mathbb {R}}^d)\) (\(k\in {\mathbb {N}}\)), hence is smooth (due to smoothness of \(u_0\) and \({{\mathfrak {f}}}\)).

Now we turn to the proof of uniqueness. Assume that \(u_1\) and \(u_2\) are solutions to (1.12), supplemented with the initial conditions \(u_1\big |_{t=0}=u_{10}\) and \(u_2\big |_{t=0}=u_{20}\), respectively. Applying the procedure from Lemma 2.2, we conclude that

$$\begin{aligned} {\hat{u}}_1(t,\varvec{\xi })-{\hat{u}}_2(t,\varvec{\xi })=e^{\frac{-t|\varvec{\xi }|^2}{1+|\varvec{\xi }|^2}} \left( {\hat{u}}_{10}(\varvec{\xi })-{\hat{u}}_{20}(\varvec{\xi }) -\int \limits _0^t \frac{i\varvec{\xi }\cdot ({\hat{{{\mathfrak {f}}}}}_1-{\hat{{{\mathfrak {f}}}}}_2)}{1+|\varvec{\xi }|^2}e^{\frac{t' |\varvec{\xi }|^2}{1+|\varvec{\xi }|^2}}\mathrm{d}t'\right) , \end{aligned}$$
(2.7)

where

$$\begin{aligned} {\hat{{{\mathfrak {f}}}}}_1-{\hat{{{\mathfrak {f}}}}}_2={{{\mathcal {F}}}}({{\mathfrak {f}}}(\mathbf{x},u_1(t,\mathbf{x}))-{{\mathfrak {f}}}(\mathbf{x},u_2(t,\mathbf{x}))). \end{aligned}$$

Let us first estimate

$$\begin{aligned} \left\| \frac{\varvec{\xi }\cdot ({\hat{{{\mathfrak {f}}}}}_1-{\hat{{{\mathfrak {f}}}}}_2)}{1+|\varvec{\xi }|^2}\right\| _{L^2({\mathbb {R}}^d)}&\le \Vert {\hat{{{\mathfrak {f}}}}}_1-{\hat{{{\mathfrak {f}}}}}_2\Vert _{L^2({\mathbb {R}}^d)} = \Vert {{\mathfrak {f}}}_1-{{\mathfrak {f}}}_2\Vert _{L^2({\mathbb {R}}^d)} \nonumber \\&\le \Vert \partial _\lambda {{\mathfrak {f}}}\Vert _{\infty }\Vert {\hat{u}}_1-{\hat{u}}_2\Vert _{L^2({\mathbb {R}}^d)}. \end{aligned}$$
(2.8)

Taking the \(L^2({\mathbb {R}}^d)\)-norm of (2.7), we get, due to (2.8):

$$\begin{aligned} \Vert {\hat{u}}_1(t,\cdot )-{\hat{u}}_2(t,\cdot )\Vert _{L^2({\mathbb {R}}^d)} \le \Vert {\hat{u}}_{10}-{\hat{u}}_{20}\Vert _{L^2({\mathbb {R}}^d)} + \Vert \partial _\lambda {{\mathfrak {f}}}\Vert _{\infty } \int \limits _0^t\Vert {\hat{u}}_1(t',\cdot )-{\hat{u}}_2(t',\cdot )\Vert _{L^2({\mathbb {R}}^d)} e^{t'} \mathrm{d}t'. \end{aligned}$$
(2.9)

Applying Gronwall’s inequality, we arrive at

$$\begin{aligned} \Vert {\hat{u}}_1(t,\cdot )-{\hat{u}}_2(t,\cdot )\Vert _{L^2({\mathbb {R}}^d)} \le \Vert {\hat{u}}_{10}-{\hat{u}}_{20}\Vert _{L^2({\mathbb {R}}^d)} e^{(e^t-1)\Vert \partial _\lambda {{\mathfrak {f}}}\Vert _{\infty }}. \end{aligned}$$

From here, we see that \(u_{10}=u_{20}\) entails \(u_1=u_2\). This concludes the proof. \(\square \)

3 Vanishing capillarity limit

In this section, we inspect the vanishing capillarity limit of (1.7). By Theorem 1.1, under the assumptions (C1)–(C3) from Sect. 1, equation (1.7) with initial data (1.10) (satisfying (1.11)) possesses a unique solution \(u_\varepsilon \in L^2((0,T)\times {\mathbb {R}}^d)\cap C^\infty ((0,T)\times {\mathbb {R}}^d)\). As announced in Sect. 1, our main result is the following:

Theorem 3.1

Under the above assumptions, if \((F_1,\ldots , F_d)=\partial _\lambda {{\mathfrak {f}}}\) lies in \(L^{{\bar{p}}'}({\mathbb {R}}^{d+1})\) for some \({\bar{p}}'\in \big (1,2+\frac{4}{d}\big )\), then:

(i) If \(\delta =o(\varepsilon ^2)\) and \(n(\varepsilon )\) satisfies (1.14), then, under the non-degeneracy condition given in Definition 3.2 below, the family \((u_\varepsilon )\) contains a strongly convergent subsequence in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\);

(ii) If \(\delta ={{{\mathcal {O}}}}(\varepsilon ^2)\), if \(\frac{\sqrt{\delta }}{\sqrt{\varepsilon }\, n(\varepsilon )^{d/2+2}}\rightarrow 0\) as \(\varepsilon \rightarrow 0\), and if, in addition to condition (C3), the \(\lambda \)-derivative \(\partial _\lambda {{\mathfrak {f}}}\) of the flux is bounded, then, under the non-degeneracy condition, the family \((u_\varepsilon )\) contains a strongly convergent subsequence in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\);

(iii) If \(\delta =o(\varepsilon ^2)\), if \(\frac{\sqrt{\delta }}{\sqrt{\varepsilon }\, n(\varepsilon )^{d/2+2}}\rightarrow 0\) as \(\varepsilon \rightarrow 0\), and if the flux \({{\mathfrak {f}}}\in C^1({\mathbb {R}}^d\times {\mathbb {R}})\), then the family \((u_\varepsilon )\) converges strongly in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\) toward the entropy solution to (1.6), (1.9).

Below we give a complete proof of (i). Although non-trivial, the other two claims are more standard and we shall only comment on the necessary modifications of the argument for establishing (i).

To proceed, let us first introduce the non-degeneracy condition. Such a condition is standard in many results using velocity averaging lemmas (cf., e.g., [36, 37, 42, 43, 50]).

Definition 3.2

The flux appearing in (1.6) is called non-degenerate if, for \(\mathbf{F}=(F_1,\ldots , F_d)=\partial _\lambda {{\mathfrak {f}}}\) with \({{\mathfrak {f}}}={{\mathfrak {f}}}(\mathbf{x},\lambda )\), the following holds: for almost every \((t,\mathbf{x})\in {\mathbb {R}}_+\times {\mathbb {R}}^{d}\) and every \(\varvec{\xi }\in S^{d}\), where \(S^d\) denotes the unit sphere in \({\mathbb {R}}^{d+1}\), the mapping

$$\begin{aligned} \lambda \mapsto \left( \xi _0+ \sum \limits _{k=1}^d F_k(\mathbf{x},\lambda ) \xi _k \right) \!, \end{aligned}$$
(3.1)

is not zero on any set of positive measure.
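For instance (our illustration, not part of the text): for the \(\mathbf{x}\)-independent Buckley–Leverett flux \(f(\lambda )=\frac{\lambda ^2}{\lambda ^2+A(1-\lambda )^2}\) from Sect. 1, the map (3.1) becomes \(\lambda \mapsto \xi _0+f'(\lambda )\xi _1\); since \(f'\) is real-analytic and non-constant, this map can vanish only at finitely many points, so the condition holds. A numerical probe of the zero set:

```python
import math

# Probe of Definition 3.2 for the (x-independent) Buckley-Leverett flux
# f(l) = l^2 / (l^2 + A*(1-l)^2): the map l -> xi0 + f'(l)*xi1 from (3.1)
# should vanish only on a negligible set of l-values.
A = 2.0

def df(l):
    """Derivative f'(l), computed by the quotient rule."""
    num, den = l * l, l * l + A * (1.0 - l) ** 2
    dnum, dden = 2.0 * l, 2.0 * l - 2.0 * A * (1.0 - l)
    return (dnum * den - num * dden) / den ** 2

grid = [k / 10000.0 for k in range(10001)]
vals = [df(l) for l in grid]
assert max(vals) - min(vals) > 1e-3          # f' is non-constant

for theta in [0.1, 0.7, 2.0, 2.9]:           # a few directions (xi0, xi1) on S^1
    xi0, xi1 = math.cos(theta), math.sin(theta)
    near_zero = sum(1 for v in vals if abs(xi0 + v * xi1) < 1e-3)
    assert near_zero < 0.01 * len(grid)      # zero set is (numerically) thin
print("non-degeneracy probe passed")
```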

Under these assumptions, we proved the following assertion in [36, Theorem 3.4].

Theorem 3.3

Assume that \(h_n\rightharpoonup h\) weakly in \({L}_{loc}^{s}({\mathbb {R}}^{d+1})\) for some \(s\ge 2\), where \(h_n\) are weak solutions to

$$\begin{aligned} \begin{aligned} \sum \limits _{k=1}^d\partial _{x_k} \left( F_k(\mathbf{x},\lambda ) h_n(\mathbf{x},\lambda )\right) =\partial _\lambda ^\kappa G_n(\mathbf{x},\lambda )\,. \end{aligned} \end{aligned}$$
(3.2)

Here, \(\kappa \in {\mathbb {N}}_0\) and \(\lambda \in {\mathbb {R}}\). Let \({\bar{p}}\in (1,s)\) and let \(\frac{1}{{\bar{p}}}+\frac{1}{{\bar{p}}'}=1\). We assume

(a) \(F_k\in { L}^{{{\bar{p}}}'}({\mathbb {R}}^{d+1})\) for \(k=1,\ldots ,d\);

(b) the sequence \((G_n)\) is strongly precompact in \({ W}_{loc}^{-1,r}({\mathbb {R}}^{d+1})\), where \(r>1\) satisfies the relation \(1+\frac{1}{s}=\frac{1}{{\bar{p}}}+\frac{1}{r}\).

Finally, assume that the non-degeneracy condition (3.1) is satisfied.

Then, for any \(\rho \in L^2_c({\mathbb {R}})\), there exists a subsequence \((h_{n_k})\) of \((h_n)\) such that

$$\begin{aligned} \int \limits _{{\mathbb {R}}}\rho (\lambda ) h_{n_k}(\mathbf{x},\lambda )\mathrm{d}\lambda \rightarrow \int \limits _{{\mathbb {R}}}\rho (\lambda ) h(\mathbf{x},\lambda )\mathrm{d}\lambda \ \ \text { strongly in } L^1_{loc}({\mathbb {R}}^d) \end{aligned}$$
(3.3)

as \(k\rightarrow \infty \).

We now derive the a priori estimates needed to apply the functional-analytic tools introduced above.

Lemma 3.4

The solution \(u_{\varepsilon }\) of (1.7) satisfies the following a priori estimates:

$$\begin{aligned} \Vert u_{\varepsilon }(t,\,.\,)\Vert _{L^{2}({\mathbb {R}}^{d})}&\le \Vert u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})}+\sqrt{\delta }\Vert \nabla u^\varepsilon _{0}\Vert _{L^{2}({\mathbb {R}}^{d})}+\sqrt{2 t} \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \Vert ^{\frac{1}{2}}_{L^{1}({\mathbb {R}}^{d}\times {\mathbb {R}})}, \end{aligned}$$
(3.4)
$$\begin{aligned} \sqrt{2\varepsilon }\Vert \nabla u_{\varepsilon }\Vert _{L^{2}([0,T],L^{2}({\mathbb {R}}^{d}))}&\le \Vert u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})} +\sqrt{\delta } \Vert \nabla u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})} +\,\sqrt{2 T} \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \Vert ^{\frac{1}{2}}_{L^{1}({\mathbb {R}}^{d}\times {\mathbb {R}})},\end{aligned}$$
(3.5)
$$\begin{aligned} \sqrt{\delta } \Vert \nabla u_{\varepsilon }(t,\,.\,) \Vert _{L^{2}({\mathbb {R}}^{d})}&\le \Vert u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})}+\sqrt{\delta } \Vert \nabla u_{0}^{\varepsilon }\Vert _{L^{2}({\mathbb {R}}^{d})} +\,\sqrt{2 t} \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \Vert ^{\frac{1}{2}}_{L^{1}({\mathbb {R}}^{d}\times {\mathbb {R}})} \end{aligned}$$
(3.6)
$$\begin{aligned} \varepsilon \delta \Vert \nabla \partial _{x_i} u_{\varepsilon } \Vert ^2_{L^{2}([0,T],L^{2}({\mathbb {R}}^{d}))}&\le 2\delta \Vert \partial _{x_i}u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+2\delta ^2 \Vert \nabla \partial _{x_i} u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} \nonumber \\&\quad +\,\frac{d\delta }{\varepsilon ^2} \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert ^2_{\infty } ( \Vert u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+\delta \Vert \nabla u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} \nonumber \\&\quad +\,2 T \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \Vert _{L^{1}({\mathbb {R}}^{d}\times {\mathbb {R}})}) +\frac{4 T \delta }{\varepsilon } \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x}, \lambda )\Vert ^2_{L^2({\mathbb {R}}^d, L^\infty ({\mathbb {R}}))}\end{aligned}$$
(3.7)
$$\begin{aligned} \delta \Vert \nabla \partial _t u_\varepsilon \Vert _{L^2([0,T],L^2({\mathbb {R}}^d))}&\le \frac{\sqrt{\delta } \sqrt{d}}{2\sqrt{\varepsilon }} \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty } \Vert u^\varepsilon _{0}\Vert _{L^{2}({\mathbb {R}}^{d})}\nonumber \\&\quad +\,\sqrt{\delta } \left( \frac{\sqrt{\delta }\sqrt{d}}{2\sqrt{\varepsilon }} \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty } +\frac{\sqrt{\varepsilon }}{\sqrt{2}}\right) \Vert \nabla u^\varepsilon _{0}\Vert _{L^{2}({\mathbb {R}}^{d})} \nonumber \\&\quad +\,\frac{\sqrt{\delta } \sqrt{d\,T}}{\sqrt{2\varepsilon }} \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty } \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert _{L^1({\mathbb {R}}^d\times {\mathbb {R}})}^{\frac{1}{2}}+\sqrt{\delta } \sqrt{T} \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert _{L^2({\mathbb {R}}^d,L^\infty ({\mathbb {R}}))}. \end{aligned}$$
(3.8)

Proof

Let us consider a family \((u_{\varepsilon })\) of smooth solutions to (1.7). Multiplying the equation by \(\eta '(u_\varepsilon )\), where \(\eta (\lambda )=\lambda ^{2}\), we have

$$\begin{aligned}&\partial _t(u_\varepsilon ^{2})+2\sum _{j=1}^{d}q^\varepsilon _{j}(\mathbf{x},u_{\varepsilon })_{x_j}+2 \int \limits _0^{u_{\varepsilon }} \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \mathrm{d}\lambda \\&\quad =2\varepsilon \sum _{j=1}^{d}(u_\varepsilon \partial _{x_j}u_{\varepsilon })_{x_j}-2\varepsilon |\nabla u_\varepsilon |^{2}+ \delta (\triangle \eta (u_\varepsilon ))_{t}-2\delta \sum _{j=1}^{d}\eta (\partial _{x_j}u_{\varepsilon })_t-2\delta \triangle u_{\varepsilon } \partial _t u_{\varepsilon }, \nonumber \end{aligned}$$
(3.9)

where \(\mathbf{q}_\varepsilon \) is defined by (recall that \({{\mathfrak {f}}}_\varepsilon =(f_1^\varepsilon ,\ldots ,f_d^\varepsilon )\))

$$\begin{aligned} \partial _\lambda \mathbf{q}_\varepsilon (\mathbf{x},\lambda )=(\lambda \partial _\lambda f^\varepsilon _1(\mathbf{x},\lambda ),\ldots , \lambda \partial _\lambda f^\varepsilon _d(\mathbf{x},\lambda )), \end{aligned}$$

and normalized by the condition \(q^\varepsilon _{j}(\mathbf{x},0)=0\), \(j=1,\ldots ,d\). Integrating over \(\mathbf{x}\in {\mathbb {R}}^d\) and \(0\le t\le T\) and using integration by parts, we get

$$\begin{aligned}&\int \limits _{{\mathbb {R}}^{d}}(|u_\varepsilon (t) |^{2} -|u^\varepsilon _{0}|^{2})\mathrm{d}\mathbf{x}+2 \int \limits _0^t \int \limits _{{\mathbb {R}}^d}\int \limits _0^{u_{\varepsilon }(t',\mathbf{x})} \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \,\mathrm{d}\lambda \mathrm{d}\mathbf{x}\mathrm{d}t'\\&\quad =-2\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon |^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'+\delta \int \limits _{{\mathbb {R}}^{d}}\triangle \big (\eta (u_\varepsilon (t,\mathbf{x}))-\eta (u^\varepsilon _0(\mathbf{x}))\big )\mathrm{d}\mathbf{x}\\&\qquad -2\delta \int \limits _{{\mathbb {R}}^{d}}\sum _{j=1}^{d}\big (|\partial _{x_j} u_{\varepsilon }(t,\mathbf{x})|^{2}-|\partial _{x_j} u^\varepsilon _{0}(\mathbf{x})|^{2}\big )\mathrm{d}\mathbf{x}-2\delta \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\triangle u_{\varepsilon } \, \partial _t u_{\varepsilon }\,\mathrm{d}\mathbf{x}\mathrm{d}t'\\&\quad =-2\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon |^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'-2\delta \int \limits _{{\mathbb {R}}^{d}}\sum _{j=1}^{d}\big (|\partial _{x_j}u_{\varepsilon }(t,\mathbf{x})|^{2} -|\partial _{x_j}u^\varepsilon _{0}(\mathbf{x})|^{2}\big )\mathrm{d}\mathbf{x}\\&\qquad +\delta \sum _{j=1}^{d}\int \limits _{{\mathbb {R}}^{d}}\int \limits _{0}^{t}\big (|\partial _{x_j}u_{\varepsilon }|^{2}\big )_{t'}\,\mathrm{d}t'\mathrm{d}\mathbf{x}=-2\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon |^{2}\,\mathrm{d}\mathbf{x}\mathrm{d}t'\\&\qquad -\delta \int \limits _{{\mathbb {R}}^{d}}\sum _{j=1}^{d}\big (|\partial _{x_j} u_{\varepsilon }(t,\mathbf{x})|^{2} -|\partial _{x_j} u^\varepsilon _{0}(\mathbf{x})|^{2}\big )\mathrm{d}\mathbf{x}\end{aligned}$$

i.e.,

$$\begin{aligned}&\int \limits _{{\mathbb {R}}^{d}}|u_\varepsilon (t)|^{2}\mathrm{d}\mathbf{x}+2\varepsilon \int \limits _{0}^{t} \int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon |^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'+\delta \sum _{j=1}^{d}\int \limits _{{\mathbb {R}}^{d}}|\partial _{x_j}u_{\varepsilon }(t,\mathbf{x})|^{2}\mathrm{d}\mathbf{x}\nonumber \\&\quad =\int \limits _{{\mathbb {R}}^{d}}|u_{0}^\varepsilon |^{2}\mathrm{d}\mathbf{x}+\delta \sum _{j=1}^{d}\int \limits _{{\mathbb {R}}^{d}}|\partial _{x_j} u^\varepsilon _{0}(\mathbf{x})|^{2}\mathrm{d}\mathbf{x}\nonumber \\&\qquad -2 \int \limits _0^t \int \limits _{{\mathbb {R}}^d}\int \limits _0^{u_{\varepsilon }(t',\mathbf{x})} \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \mathrm{d}\lambda \mathrm{d}\mathbf{x}\mathrm{d}t'. \end{aligned}$$
(3.10)

The relations (3.4), (3.5) and (3.6) are easy consequences of (3.10).

To show (3.7), let us now differentiate equation (1.7) with respect to \(x_{i}\) and then multiply it by \(2\partial _{x_i}u_{\varepsilon }\). Since the right-hand side of (1.7) is linear in \(u_\varepsilon \), the calculation for that side remains the same; after integrating the equation over \([0,t]\times {\mathbb {R}}^d\) and integrating by parts, we get

$$\begin{aligned} \begin{aligned}&\int \limits _{{\mathbb {R}}^{d}}(|\partial _{x_i}u_{\varepsilon }(t,\mathbf{x})|^{2}-|\partial _{x_i}u^\varepsilon _{0}(\mathbf{x})|^{2})\mathrm{d}\mathbf{x}-2\sum _{k=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}} \partial _\lambda f^\varepsilon _{k}(\mathbf{x}, u_\varepsilon ) \, \partial _{x_k} u_\varepsilon \, \partial _{x_i}^2 u_{\varepsilon }\mathrm{d}\mathbf{x}\mathrm{d}t\\&\quad -\,2\int \limits _0^t \int \limits _{{\mathbb {R}}^d} \partial _{x_i}^2 u_\varepsilon \, \mathrm {div}{{\mathfrak {f}}}^\varepsilon (\mathbf{x}, \lambda )\big |_{\lambda =u_\varepsilon } \mathrm{d}\mathbf{x}\mathrm{d}t =-2\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _{x_i}u_\varepsilon |^{2}\mathrm{d}x\mathrm{d}s\\&\quad -\,\delta \int \limits _{{\mathbb {R}}^{d}}\sum _{j=1}^{d}(|\partial _{x_{i}x_j}u_\varepsilon (t,\mathbf{x})|^{2}-|\partial _{x_{i}x_j}u^\varepsilon _{0}|^{2})\mathrm{d}\mathbf{x}, \end{aligned} \end{aligned}$$

i.e.,

$$\begin{aligned}&\int \limits _{{\mathbb {R}}^{d}}|\partial _{x_i} u_{\varepsilon }(t,\mathbf{x})|^{2} \mathrm{d}\mathbf{x}+2\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _{x_i}u_\varepsilon |^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'+\delta \int \limits _{{\mathbb {R}}^{d}} |\nabla \partial _{x_{i}} u_\varepsilon (t,\mathbf{x})|^{2}\mathrm{d}\mathbf{x}\nonumber \\&\quad =\int \limits _{{\mathbb {R}}^{d}}| \partial _{x_i}u^\varepsilon _{0}(\mathbf{x})|^{2}\mathrm{d}\mathbf{x}+\delta \int \limits _{{\mathbb {R}}^{d}} |\nabla \partial _{x_{i}}u^\varepsilon _{0}(\mathbf{x})|^{2}\mathrm{d}\mathbf{x}+2 \int \limits _0^t \int \limits _{{\mathbb {R}}^d} \partial _{x_i}^2 u_\varepsilon \, \mathrm {div}{{\mathfrak {f}}}^\varepsilon (\mathbf{x}, \lambda )\big |_{\lambda =u_\varepsilon } \mathrm{d}\mathbf{x}\mathrm{d}t' \nonumber \\&\qquad +2\sum _{k=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\partial _\lambda f^\varepsilon _{k}(\mathbf{x},u_\varepsilon ) \partial _{x_k}u_\varepsilon \partial _{x_i}^2 u_{\varepsilon }\mathrm{d}x\mathrm{d}t' \nonumber \\&\quad \le \Vert \partial _{x_i}u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+\delta \Vert \nabla \partial _{x_i} u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} + \frac{\varepsilon }{2} \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}} |\partial _{x_{i}}^2 u_{\varepsilon }|^2 \mathrm{d}\mathbf{x}\mathrm{d}t'\nonumber \\&\qquad +\frac{2}{\varepsilon } \Vert \mathrm {div}{{\mathfrak {f}}}^\varepsilon (\mathbf{x}, \lambda )\big |_{\lambda =u_\varepsilon }\Vert ^2_{L^2([0,T]\times {\mathbb {R}}^d)} +\frac{d \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert ^2_{\infty }}{\varepsilon } \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_{\varepsilon }|^2 \mathrm{d}\mathbf{x}\mathrm{d}t'\nonumber \\&\qquad +\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}} |\partial _{x_{i}}^2 u_{\varepsilon }|^2 
\mathrm{d}\mathbf{x}\mathrm{d}t' \nonumber \\&\quad \le \Vert \partial _{x_i}u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+\delta \Vert \nabla \partial _{x_i} u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} +\frac{d \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert ^2_{\infty }}{\varepsilon } \Vert \nabla u_{\varepsilon }\Vert ^2_{L^2([0,T],L^2({\mathbb {R}}^d))} \nonumber \\&\qquad + \frac{3\varepsilon }{2} \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}} |\nabla \partial _{x_{i}} u_{\varepsilon }|^2 \mathrm{d}\mathbf{x}\mathrm{d}t' +\frac{2}{\varepsilon } \Vert \mathrm {div}{{\mathfrak {f}}}^\varepsilon (\mathbf{x}, \lambda )\big |_{\lambda =u_\varepsilon }\Vert ^2_{L^2([0,T]\times {\mathbb {R}}^d)} \end{aligned}$$
(3.11)

where \(\Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }=\max _{1\le j\le d}\Vert \partial _{\lambda } f^\varepsilon _{j}\Vert _{L^\infty ({\mathbb {R}}^{d+1})}\). Multiplying the above inequality by \(\delta \), we have

$$\begin{aligned} \begin{aligned}&\delta \int \limits _{{\mathbb {R}}^{d}}|\partial _{x_i}u_{\varepsilon }(t,\mathbf{x})|^{2}\mathrm{d}\mathbf{x}+\frac{\varepsilon \delta }{2} \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _{x_i} u_{\varepsilon }|^{2}\mathrm{d}\mathbf{x}\mathrm{d}t + \delta ^2 \int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _{x_{i}}u_{\varepsilon }(t,\mathbf{x})|^{2}\mathrm{d}\mathbf{x}\\&\quad \le \delta \Vert \partial _{x_i}u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+\delta ^2 \Vert \nabla \partial _{x_i} u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} + \frac{d \delta }{\varepsilon } \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert ^2_{\infty } \Vert \nabla u_{\varepsilon }\Vert ^2_{L^2([0,T],L^2({\mathbb {R}}^d))} \\&\qquad +\frac{2\delta }{\varepsilon } \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x}, \lambda )\big |_{\lambda =u_\varepsilon }\Vert ^2_{L^2([0,T]\times {\mathbb {R}}^d)} \\&\quad \le \delta \Vert \partial _{x_i}u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+\delta ^2 \Vert \nabla \partial _{x_i} u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} +\frac{2T \delta }{\varepsilon } \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x}, \lambda )\Vert ^2_{L^2({\mathbb {R}}^d, L^\infty ({\mathbb {R}}))} \\&\qquad +\frac{d\delta }{2\varepsilon ^2} \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert ^2_{\infty } ( \Vert u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})}+\delta \Vert \nabla u_{0}^{\varepsilon }\Vert ^2_{L^{2}({\mathbb {R}}^{d})} +2 T \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \Vert _{L^{1}({\mathbb {R}}^{d}\times {\mathbb {R}})}), \end{aligned} \end{aligned}$$

where in the last inequality we used Eq. (3.10). From here, (3.7) immediately follows.

We next aim to prove inequality (3.8). To this end, we multiply equation (1.7) by \(\partial _t u_{\varepsilon }\) and then proceed similarly to the above. Namely,

$$\begin{aligned} \begin{aligned}&(\partial _t u_{\varepsilon })^{2}+\sum _{j=1}^{d} \partial _t u_{\varepsilon } \, \partial _\lambda f_j^\varepsilon (\mathbf{x},u_\varepsilon ) \partial _{x_j}u_{\varepsilon } + \partial _t u_{\varepsilon } \, \mathrm {div}{{\mathfrak {f}}}^\varepsilon (\mathbf{x},\lambda )|_{\lambda =u_\varepsilon } \\&\quad =\varepsilon \partial _t u_{\varepsilon }\triangle u_\varepsilon +\delta \partial _t u_{\varepsilon }\, \triangle \partial _t u_{\varepsilon }, \end{aligned} \end{aligned}$$
(3.12)

and we integrate equation (3.12) with respect to time and space.

Integration by parts gives

$$\begin{aligned}&\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\partial _t u_{\varepsilon }|^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'+\sum _{j=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\partial _t u_{\varepsilon }\partial _\lambda f_j^\varepsilon (\mathbf{x}, u_\varepsilon ) \partial _{x_j} u_{\varepsilon }\mathrm{d}\mathbf{x}\mathrm{d}t' \\&\quad +\,\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}} \partial _t u_{\varepsilon } \, \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda )|_{\lambda =u_\varepsilon } \mathrm{d}\mathbf{x}\mathrm{d}t' \\&\quad =\varepsilon \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\triangle u_\varepsilon \, \partial _t u_{\varepsilon } \mathrm{d}\mathbf{x}\mathrm{d}t'+\delta \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\partial _t u_{{\varepsilon }}\triangle \partial _t u_{\varepsilon }\mathrm{d}\mathbf{x}\mathrm{d}t'\\&\quad =-\varepsilon \sum _{j=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\partial _{x_j} u_{\varepsilon }\partial _{tx_j}u_{\varepsilon } \mathrm{d}\mathbf{x}\mathrm{d}t'-\delta \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _t u_{\varepsilon }|^{2}\mathrm{d}\mathbf{x}\mathrm{d}t', \end{aligned}$$

i.e.,

$$\begin{aligned} \begin{aligned}&\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}} |\partial _t u_{\varepsilon }|^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'+\delta \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _t u_{\varepsilon }|^{2}\mathrm{d}\mathbf{x}\mathrm{d}t'\\&\quad =-\varepsilon \sum _{j=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\partial _{x_j} u_{\varepsilon } \partial _{tx_j} u_{\varepsilon }\mathrm{d}\mathbf{x}\mathrm{d}t'\\&\qquad - \sum _{j=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}\left( \partial _t u_{\varepsilon }\partial _\lambda f_j^\varepsilon (\mathbf{x}, u_\varepsilon ) \partial _{x_j} u_{\varepsilon } - \partial _t u_{\varepsilon } \, \partial _{x_j} f_j^\varepsilon (\mathbf{x},\lambda )|_{\lambda =u_\varepsilon } \right) \mathrm{d}\mathbf{x}\mathrm{d}t' \\&\quad \le \Vert \partial _\lambda {{\mathfrak {f}}}\Vert _{\infty }\sum _{j=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\partial _t u_{\varepsilon }| \, |\partial _{x_j} u_{\varepsilon }|\mathrm{d}\mathbf{x}\mathrm{d}t' \\&\qquad + \Vert \partial _t u_\varepsilon \Vert _{L^2([0,T]\times {\mathbb {R}}^d)} \sqrt{T} \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert _{L^2({\mathbb {R}}^d,L^\infty ({\mathbb {R}}))} -\frac{\varepsilon }{2}\sum _{j=1}^{d}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}(|\partial _{x_j} u_{\varepsilon }|^{2})_t \mathrm{d}\mathbf{x}\mathrm{d}t' \\&\quad \le \frac{3}{4}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\partial _t u_{\varepsilon }|^{2} \mathrm{d}\mathbf{x}\mathrm{d}t'+\frac{d \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }^{2}}{2}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon |^{2}\mathrm{d}\mathbf{x}\mathrm{d}t' \\&\qquad +T \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert ^2_{L^2({\mathbb {R}}^d, L^\infty ({\mathbb {R}}))} -\frac{\varepsilon }{2}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon 
(t)|^{2}\mathrm{d}x+\frac{\varepsilon }{2}\int \limits _{{\mathbb {R}}^{d}}|\nabla u^\varepsilon _{0}|^{2}\mathrm{d}\mathbf{x}. \end{aligned} \end{aligned}$$
(3.13)

By multiplying (3.13) by \(\delta \) and taking into account (3.5), we conclude

$$\begin{aligned}&\frac{\delta }{4}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\partial _t u_{\varepsilon }|^{2} \mathrm{d}\mathbf{x}\mathrm{d}t'+\frac{\varepsilon \delta }{2}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon (t)|^{2}\mathrm{d}\mathbf{x}+\delta ^{2} \int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla \partial _t u_{\varepsilon }|^{2}\mathrm{d}\mathbf{x}\mathrm{d}t' \\&\quad \le \frac{\delta d\Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }^{2}}{2}\int \limits _{0}^{t}\int \limits _{{\mathbb {R}}^{d}}|\nabla u_\varepsilon |^{2}\mathrm{d}x\mathrm{d}t'+\frac{\delta \varepsilon }{2}\int \limits _{{\mathbb {R}}^{d}}|\nabla u^\varepsilon _{0}|^{2} \mathrm{d}\mathbf{x}+\delta T \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert ^2_{L^2({\mathbb {R}}^d, L^\infty ({\mathbb {R}}))} \nonumber \\&\quad \le \frac{\delta d\Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }^{2}}{4\varepsilon } \Vert u^\varepsilon _{0}\Vert _{L^{2}({\mathbb {R}}^{d})}^{2}+ \left( \frac{\delta ^2 d\Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }^{2}}{4\varepsilon }+\frac{\delta \varepsilon }{2}\right) \Vert \nabla u^\varepsilon _{0}\Vert _{L^{2}({\mathbb {R}}^{d})}^{2} \nonumber \\&\qquad +\frac{\delta d\Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }^{2}T}{2\varepsilon } \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert _{L^1({\mathbb {R}}^d\times {\mathbb {R}})}+\delta T \Vert \mathrm {div}{{\mathfrak {f}}}_\varepsilon \Vert ^2_{L^2({\mathbb {R}}^d,L^\infty ({\mathbb {R}}))},\nonumber \end{aligned}$$
(3.14)

which gives the inequality (3.8). \(\square \)
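In the flux-free case \({{\mathfrak {f}}}_\varepsilon \equiv 0\), the identity (3.10) states that \(\Vert u_\varepsilon (t)\Vert ^2_{L^2}+2\varepsilon \int _0^t\Vert \nabla u_\varepsilon \Vert ^2_{L^2}\,\mathrm{d}t'+\delta \Vert \nabla u_\varepsilon (t)\Vert ^2_{L^2}\) is conserved. For the linear pseudo-parabolic equation \(\partial _t u=\varepsilon \triangle u+\delta \triangle \partial _t u\), this can be verified mode by mode via Plancherel; a sketch in one space dimension (the toy parameters are ours):

```python
import math

# Flux-free instance of the energy identity (3.10): for u_t = eps*u_xx + delta*u_txx,
# each Fourier mode u_hat(t) = exp(-eps*k^2*t / (1+delta*k^2)) * u_hat(0) satisfies
# (1+delta*k^2)|u_hat(t)|^2 + 2*eps*k^2 \int_0^t |u_hat|^2 dt' = (1+delta*k^2)|u_hat(0)|^2.
eps, delta, k, T, n = 0.05, 0.01, 3.0, 2.0, 20000

rate = eps * k * k / (1.0 + delta * k * k)

def u2(t):
    """Squared modulus of the mode, |u_hat(t)|^2."""
    return math.exp(-2.0 * rate * t)

dt = T / n
dissip = sum(0.5 * dt * (u2(j * dt) + u2((j + 1) * dt)) for j in range(n))
energy_T = (1.0 + delta * k * k) * u2(T) + 2.0 * eps * k * k * dissip
energy_0 = 1.0 + delta * k * k
assert abs(energy_T - energy_0) < 1e-6 * energy_0
print("pseudo-parabolic energy balance verified")
```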

We shall now derive the kinetic formulation for (1.7) and then show that we can apply the velocity averaging result on this form of the equation.

Lemma 3.5

(Kinetic formulation of (1.7)) Assume that the function \(u_\varepsilon \) satisfies (1.7). Then, the function \(h_\varepsilon (t,\mathbf{x},\lambda )=\mathrm{sgn}(u_\varepsilon (t,\mathbf{x})-\lambda )\) solves a linear PDE of the form

$$\begin{aligned}&\partial _t h_\varepsilon +\mathrm {div}(h_\varepsilon \partial _\lambda {{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) )-\partial _\lambda (h_\varepsilon \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ))=\partial _\lambda \mathrm {div}(G_1^\varepsilon )+ \partial ^2_\lambda G^\varepsilon _2, \end{aligned}$$
(3.15)

where

(i) \(\mathrm {div}G_1^\varepsilon \rightarrow 0\) in \(W^{-1,r}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d\times {\mathbb {R}})\) for any \(r\in [1,2)\);

(ii) \((G^\varepsilon _2)\) is bounded in the space of measures \({{{\mathcal {M}}}}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\) and thus strongly precompact in \(W_{loc}^{-1,r}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\) for any \(r\in [1,\frac{d+2}{d+1})\).

Proof

Before we start, notice that we require merely local convergence (see conditions (i) and (ii) of the lemma), which means that we can always choose \(\varepsilon \) small enough so that \(K_\varepsilon \equiv 1\) on the fixed relatively compact set on which we derive the estimates.

Therefore, it is enough to prove the lemma under the simplifying assumption

$$\begin{aligned} {{\mathfrak {f}}}_\varepsilon ={{\mathfrak {f}}}\star \omega _{n(\varepsilon )}. \end{aligned}$$

Also, in order to avoid a proliferation of symbols, we will write all integrals over the entire space \({\mathbb {R}}^d\times {\mathbb {R}}\), with the implicit understanding that we are in fact integrating over the support of the corresponding test function (see (3.19)).

Then, note that for any \(\eta \in C^3({\mathbb {R}})\) such that \(\eta '\in C^2_c({\mathbb {R}})\) we have (recall that \(h_\varepsilon (t,\mathbf{x},\lambda )=\mathrm{sgn}(u_\varepsilon (t,\mathbf{x})-\lambda )\))

$$\begin{aligned} \eta (u_\varepsilon ) = \frac{1}{2} \int \limits _{\mathbb {R}}\eta '(\lambda ) h_\varepsilon (t,\mathbf{x},\lambda ) \mathrm{d}\lambda +\frac{\eta (-R)+\eta (R)}{2} \end{aligned}$$
(3.16)

for \(R>0\) such that \(\mathrm{supp}(\eta ')\subset [-R,R]\). Moreover, for any function \(f\in L^1({\mathbb {R}})\)

$$\begin{aligned} \int \limits _{-\infty }^{u_\varepsilon } f(\lambda ) \mathrm{d}\lambda =\frac{1}{2}\int \limits _{\mathbb {R}}f(\lambda ) h_\varepsilon (t,\mathbf{x},\lambda ) \mathrm{d}\lambda +\frac{1}{2}\int \limits _{\mathbb {R}}f(\lambda ) \mathrm{d}\lambda . \end{aligned}$$
(3.17)
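Both identities are elementary: for (3.16), \(\int _{-R}^{R}\eta '(\lambda )\,\mathrm{sgn}(u_\varepsilon -\lambda )\,\mathrm{d}\lambda =2\eta (u_\varepsilon )-\eta (-R)-\eta (R)\) whenever \(\mathrm{supp}(\eta ')\subset [-R,R]\). A numerical confirmation (with our own choice of \(\eta \) and of the value \(u\)):

```python
import math

# Check of the kinetic identities (3.16)-(3.17) with h(u,lam) = sgn(u - lam),
# eta'(lam) = cos(pi*lam/2) on [-1,1] (so R = 1), eta(lam) = (2/pi)*sin(pi*lam/2).
R, n, u = 1.0, 40000, 0.3

def h(u, lam):
    return 1.0 if u > lam else (-1.0 if u < lam else 0.0)

def eta_prime(lam):
    return math.cos(math.pi * lam / 2.0) if abs(lam) <= R else 0.0

def eta(lam):
    lam = max(-R, min(R, lam))          # eta is constant outside [-R, R]
    return (2.0 / math.pi) * math.sin(math.pi * lam / 2.0)

dlam = 2.0 * R / n
nodes = [-R + (j + 0.5) * dlam for j in range(n)]   # midpoint quadrature
half_int_fh = 0.5 * dlam * sum(eta_prime(l) * h(u, l) for l in nodes)

# (3.16): eta(u) = (1/2) \int eta' h dlam + (eta(-R) + eta(R))/2
assert abs(half_int_fh + 0.5 * (eta(-R) + eta(R)) - eta(u)) < 1e-3

# (3.17) with f = eta': \int_{-inf}^{u} f = (1/2) \int f h + (1/2) \int f
int_f = dlam * sum(eta_prime(l) for l in nodes)
assert abs(half_int_fh + 0.5 * int_f - (eta(u) - eta(-R))) < 1e-3
print("kinetic identities verified")
```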

To proceed, we multiply equation (1.7) by \(\eta '(u_\varepsilon )\). We get

$$\begin{aligned}&\partial _{t} \eta (u^{\varepsilon })+\mathrm {div}\int \limits _{-\infty }^{u_\varepsilon } \eta '(\lambda ) \partial _\lambda {{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \mathrm{d}\lambda +\int \limits _{-\infty }^{u_\varepsilon } \eta ''(\lambda )\mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) \mathrm{d}\lambda \\&\quad =\sum _{j=1}^{d}\partial _{x_j}\left( \varepsilon \eta '(u_{\varepsilon })\partial _{x_j}u_{\varepsilon } +\delta \eta '(u_{\varepsilon })\partial _{t,x_{j}}u_{\varepsilon }\right) \nonumber \\&\qquad -\varepsilon \eta ''(u_{\varepsilon })\sum _{j=1}^{d}|\partial _{{x_j}}u_\varepsilon |^{2} -\delta \eta ''(u_\varepsilon )\sum _{j=1}^{d}\partial _{x_{j}}u_{\varepsilon }\partial _{t,x_{j}}u_{\varepsilon }. \nonumber \end{aligned}$$
(3.18)

Now, we apply (3.16) and (3.17) and rewrite the latter equation in the form

$$\begin{aligned}&\frac{1}{2} \partial _{t} \int \limits _{{\mathbb {R}}} \eta '(\lambda ) h_\varepsilon \mathrm{d}\lambda +\mathrm {div}\int \limits _{\mathbb {R}}\eta '(\lambda ) \partial _\lambda {{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) h_\varepsilon \mathrm{d}\lambda +\int \limits _{\mathbb {R}}\eta ''(\lambda )\mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) h_\varepsilon \mathrm{d}\lambda \\&\quad =\sum _{j=1}^{d}\partial _{x_j}\left( \varepsilon \int \limits _{\mathbb {R}}\eta ''(\lambda ) h_\varepsilon \mathrm{d}\lambda \partial _{x_j}u_{\varepsilon }+\delta \int \limits _{\mathbb {R}}\eta ''(\lambda ) h_\varepsilon \mathrm{d}\lambda \partial _{t,x_{j}}u_{\varepsilon }\right) \\&\qquad -\varepsilon \int \limits _{\mathbb {R}}\eta '''(\lambda ) h_\varepsilon \mathrm{d}\lambda \sum _{j=1}^{d}|\partial _{{x_j}}u_\varepsilon |^{2}-\delta \int \limits _{\mathbb {R}}\eta '''(\lambda ) h_\varepsilon \mathrm{d}\lambda \sum _{j=1}^{d}\partial _{x_{j}}u_{\varepsilon }\partial _{t,x_{j}}u_{\varepsilon }. \end{aligned}$$

If we rewrite the latter in the variational formulation, we get for every test function \(\phi \in C_c^2({\mathbb {R}}^+\times {\mathbb {R}}^d)\)

$$\begin{aligned}&-\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}} \left( \frac{1}{2}\partial _{t}\phi (t,\mathbf{x}) \eta '(\lambda ) + \nabla \phi \, \eta '(\lambda ) \cdot \partial _\lambda {{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda )\right) h_\varepsilon \mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \\&\quad +\,\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}} \eta ''(\lambda ) \, \phi \, \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ) h_\varepsilon \mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \nonumber \\&\quad =-\sum _{j=1}^{d}\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}}h_\varepsilon \partial _{x_j}\phi \, \eta ''(\lambda ) \left( \varepsilon \partial _{x_j}u_{\varepsilon }+\delta \partial _{t,x_{j}}u_{\varepsilon }\right) \mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \nonumber \\&\qquad - \int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}}\phi (t,\mathbf{x}) \eta '''(\lambda ) h_\varepsilon \left( \varepsilon \sum _{j=1}^{d}|\partial _{{x_j}}u_\varepsilon |^{2}+\delta \sum _{j=1}^{d}\partial _{x_{j}}u_{\varepsilon }\partial _{t,x_{j}}u_{\varepsilon } \right) \mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \nonumber \\&\quad =:-\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}} \left( \Gamma _{1}^{\varepsilon }+\Gamma _{2}^{\varepsilon }\right) \cdot \nabla \phi (t,\mathbf{x}) \eta ''(\lambda )\mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \nonumber \\&\qquad -\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}} \left( \Gamma _{3}^{\varepsilon }+\Gamma _{4}^{\varepsilon }\right) \phi (t,\mathbf{x}) \eta '''(\lambda ) \mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda .\nonumber \end{aligned}$$
(3.19)

Now, since \(\eta '\in C^2_c({\mathbb {R}})\) is arbitrary, if we put

$$\begin{aligned} G_1^\varepsilon =\Gamma _{1}^{\varepsilon }+\Gamma _{2}^{\varepsilon } \ \ \mathrm{and} \ \ G_2^\varepsilon =\Gamma _{3}^{\varepsilon }+\Gamma _{4}^{\varepsilon } \end{aligned}$$
(3.20)

we arrive at (3.15).

Next, we prove (i) and (ii). To begin with, we show that \(\mathrm {div}\Gamma _{1}^{\varepsilon }\rightarrow 0\) as \(\varepsilon \rightarrow 0^{+}\) in \(W^{-1,r}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d\times {\mathbb {R}})\), \(r\le 2\).

We have for any \(\phi \in C_c^{1}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\) supported in the hypercube with side length M centered at zero:

$$\begin{aligned} \begin{aligned} \left| \left\langle \mathrm {div}\Gamma _{1}^{\varepsilon },\phi \right\rangle \right|&= \left| \sum _{j=1}^{d}\int \limits _{0}^{T}\int \limits _{{\mathbb {R}}^{d+1}}\varepsilon h_\varepsilon u_{x_j}^{\varepsilon }\phi _{x_j}(t,\mathbf{x}, \lambda )\mathrm{d}\mathbf{x}\mathrm{d}\lambda \mathrm{d}t\right| \\&\le \varepsilon M^{1/2} \Vert \phi \Vert _{H^1({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})} \, \Vert \nabla u^\varepsilon \Vert _{L^2([0,T]\times {\mathbb {R}}^d)}. \end{aligned} \end{aligned}$$
(3.21)
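The second inequality in (3.21) follows from the Cauchy–Schwarz inequality; as a sketch of the intermediate step (assuming \(|h_\varepsilon |\le 1\), which is consistent with (3.32), and using that \(\phi \) is supported in \(|\lambda |\le M/2\)),

$$\begin{aligned} \left| \left\langle \mathrm {div}\Gamma _{1}^{\varepsilon },\phi \right\rangle \right| \le \varepsilon \int \limits _{-M/2}^{M/2} \Vert \nabla u^{\varepsilon }\Vert _{L^2([0,T]\times {\mathbb {R}}^d)} \, \Vert \nabla \phi (\cdot ,\cdot ,\lambda )\Vert _{L^2([0,T]\times {\mathbb {R}}^d)} \mathrm{d}\lambda , \end{aligned}$$

and a further Cauchy–Schwarz inequality in \(\lambda \) produces the factor \(M^{1/2}\) and the norm \(\Vert \phi \Vert _{H^1}\).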

To proceed further, we use (3.5). For estimating the term \(\Vert \mathrm {div}({{\mathfrak {f}}}_\varepsilon )\Vert _{L^1({\mathbb {R}}^{d+1})}\) appearing in (3.5), we use approximations of the form \({{\mathfrak {f}}}_\varepsilon :={{\mathfrak {f}}}\star \omega _{n(\varepsilon )}\) as in (1.8), with \(n(\varepsilon )\) satisfying (1.14). Since \(\delta =o(\varepsilon ^2)\), \({{\mathfrak {f}}}_\varepsilon \) converges to \({{\mathfrak {f}}}\) as \(\varepsilon \rightarrow 0\) in \(L^r_{loc}({\mathbb {R}}^d\times {\mathbb {R}})\) for any \(1\leqslant r \leqslant p\). Using (C2) and (C3), we get

$$\begin{aligned} \begin{aligned} \Vert \mathrm {div}({{\mathfrak {f}}}_\varepsilon )\Vert _{L^1({\mathbb {R}}^{d+1})}&\le \Vert |\mathrm {div}({{\mathfrak {f}}})|\star \omega _{n(\varepsilon )}\Vert _{L^1({\mathbb {R}}^{d+1})} \\&\le \left\| \frac{\mu (\mathbf{x})}{1+|\lambda |^{1+\beta }}\star \omega _{n(\varepsilon )}(\mathbf{x},\lambda ) \right\| _{L^1({\mathbb {R}}^{d+1})}\\&\le C \mu ({\mathbb {R}}^{d}) \Vert \omega _{n(\varepsilon )} \Vert _{L^1({\mathbb {R}}^{d+1})} \le C, \end{aligned} \end{aligned}$$
(3.22)

where here and below, C denotes a generic constant that may change from line to line. Thus (keeping in mind (3.5) and (1.11)), from (3.22) we conclude

$$\begin{aligned} \left| \left\langle \mathrm {div}\Gamma _{1}^{\varepsilon },\phi \right\rangle \right| \le \sqrt{\varepsilon }\, (n(\varepsilon )^{-1}\sqrt{\delta }+1)\,C\, \Vert \phi \Vert _{H^{1}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})}\rightarrow 0 \quad \text{ as } \varepsilon \rightarrow 0^{+} \end{aligned}$$
(3.23)

by (1.14).

For the next estimate, we have

$$\begin{aligned} \begin{aligned} \left| \left\langle \mathrm {div}\Gamma _{2}^{\varepsilon },\phi \right\rangle \right| \le 2 M^{1/2} \Vert \phi \Vert _{H_{0}^{1}([0,T]\times {\mathbb {R}}^{d+1})} \delta \Vert \nabla u^{\varepsilon }_{t}\Vert _{L^{2}([0,T],L^{2}({\mathbb {R}}^{d}))}. \end{aligned} \end{aligned}$$
(3.24)

Comparing with (3.8), we see that it remains to estimate the terms \(\Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty } \) and \(\Vert \mathrm {div}({{\mathfrak {f}}}_\varepsilon )\Vert _{L^2({\mathbb {R}}^{d},L^\infty ({\mathbb {R}}))}\). For the first one, we have

$$\begin{aligned} \begin{aligned} \Vert \partial _\lambda {{\mathfrak {f}}}_\varepsilon \Vert _{\infty }&= \sup _{i} \Vert (\partial _\lambda f_i)\star \omega _{n(\varepsilon )}\Vert _{L^\infty ({\mathbb {R}}^{d+1})}\\&= \sup _{i} \Vert (\partial _\lambda f_i)\star \omega _{n(\varepsilon )}\Vert _{L^\infty ({\mathbb {R}}^{d}, L^\infty ({\mathbb {R}}))} \\&\le \sup _{i} \Vert \partial _\lambda f_i\Vert _{L^1({\mathbb {R}}^{d}, L^\infty ({\mathbb {R}}))} \Vert \omega _{n(\varepsilon )} \Vert _{L^\infty ({\mathbb {R}}^{d+1})} \le C n(\varepsilon )^{-d-1} \end{aligned} \end{aligned}$$
(3.25)

by (C3). For the second one, we obtain from (C2) and the Young inequality

$$\begin{aligned} \begin{aligned} \Vert \mathrm {div}({{\mathfrak {f}}}_\varepsilon )\Vert _{L^2({\mathbb {R}}^{d},L^\infty ({\mathbb {R}}))}&\le \Vert \frac{\mu (\mathbf{x})}{1+|\lambda |^{1+\beta }}\star \omega _{n(\varepsilon )}(\mathbf{x},\lambda ) \Vert _{L^2({\mathbb {R}}^{d},L^\infty ({\mathbb {R}}))}\\&= \Vert \mu \star \omega _{n(\varepsilon )}^{(1)}(\mathbf{x})\frac{1}{1+|\lambda |^{1+\beta }}\star \omega ^{(2)}_{n(\varepsilon )}(\lambda ) \Vert _{L^2({\mathbb {R}}^{d},L^\infty ({\mathbb {R}}))} \\&\le C n(\varepsilon )^{-1}\Vert \mu \star \omega ^{(1)}_{n(\varepsilon )} \Vert _{L^2({\mathbb {R}}^d)}\\&\le C n(\varepsilon )^{-1}\mu ({\mathbb {R}}^d) \Vert \omega ^{(1)}_{n(\varepsilon )} \Vert _{L^2({\mathbb {R}}^d)} \le C n(\varepsilon )^{-d/2-1}. \end{aligned} \end{aligned}$$
(3.26)
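The last two steps in (3.26) use only the scaling of the mollifier. Assuming the standard product form \(\omega _{n}=\omega ^{(1)}_{n}\otimes \omega ^{(2)}_{n}\) with \(\omega ^{(1)}_{n}(\mathbf{x})=n^{-d}\omega ^{(1)}(\mathbf{x}/n)\) and \(\omega ^{(2)}_{n}(\lambda )=n^{-1}\omega ^{(2)}(\lambda /n)\), we have

$$\begin{aligned} \Vert \omega ^{(2)}_{n}\Vert _{L^\infty ({\mathbb {R}})}=n^{-1}\Vert \omega ^{(2)}\Vert _{L^\infty ({\mathbb {R}})} \ \ \mathrm{and} \ \ \Vert \omega ^{(1)}_{n}\Vert _{L^2({\mathbb {R}}^d)}=n^{-d/2}\Vert \omega ^{(1)}\Vert _{L^2({\mathbb {R}}^d)}, \end{aligned}$$

which explain the rates \(n(\varepsilon )^{-1}\) and \(n(\varepsilon )^{-d/2}\) appearing there.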

Combining (3.24), (3.25), (3.8), (3.22), and (1.11), we see that

$$\begin{aligned}&|\langle \mathrm {div}\Gamma _{2}^{\varepsilon },\phi \rangle | \nonumber \\&\quad \le C \delta ^{1/2} \left( \frac{1}{\sqrt{\varepsilon } n(\varepsilon )^{d+1}} + \Big (\frac{\sqrt{\delta }}{\sqrt{\varepsilon }n(\varepsilon )^{d+1}}+\sqrt{\varepsilon }\Big ) \frac{1}{n(\varepsilon )} + \frac{1}{\sqrt{\varepsilon }n(\varepsilon )^{d+1}} +\frac{1}{n(\varepsilon )^{d/2 + 1}} \right) \nonumber \\&\qquad \cdot \Vert \phi \Vert _{H_{0}^{1}([0,T]\times {\mathbb {R}}^{d+1})} \rightarrow 0 \quad \text{ as } \varepsilon \rightarrow 0^{+} \end{aligned}$$
(3.27)

by (1.14). This implies claim (i) of the current lemma. Next, using (3.5), (3.8), (1.11), (3.22), and (3.25), we estimate

$$\begin{aligned} \begin{aligned} \left| \left\langle \Gamma _{4}^{\varepsilon },\phi \right\rangle \right|&\le \delta \left| \sum _{j=1}^{d}\int \limits _{0}^{T}\int \limits _{[-M,M]^{d+1}}|u_{x_j}^{\varepsilon }| \, |u_{t,x_{j}}^{\varepsilon }| \, |\phi (t,\mathbf{x},\lambda )|\mathrm{d}\mathbf{x}\mathrm{d}t \mathrm{d}\lambda \right| \\&\le M \delta \Vert \phi \Vert _{\infty }\sum _{j=1}^{d}\int \limits _{0}^{T}\int \limits _{{\mathbb {R}}^{d}}| u_{x_j}^{\varepsilon }||u_{t,x_{j}}^{\varepsilon }|\mathrm{d}x \mathrm{d}t\\&\le M \delta \Vert \phi \Vert _{\infty } \Vert \nabla u^{\varepsilon }\Vert _{L^{2}([0,T]\times {\mathbb {R}}^{d})} \Vert \nabla u_{t}^{\varepsilon }\Vert _{L^{2}([0,T]\times {\mathbb {R}}^{d})}\\&\leqslant C \sqrt{\delta } \Vert \phi \Vert _\infty \Big (\frac{1}{\sqrt{\varepsilon }} + \frac{{\sqrt{\delta }}}{\sqrt{\varepsilon } n(\varepsilon )}\Big ) \Big [\frac{1}{\sqrt{\varepsilon }n(\varepsilon )^{d+1}}+ \\&\quad +\,\Big (\frac{\sqrt{\delta }}{\sqrt{\varepsilon }n(\varepsilon )^{d+1}} +\sqrt{\varepsilon }\Big )\frac{1}{n(\varepsilon )} + \frac{1}{\sqrt{\varepsilon }n(\varepsilon )^{d+1}} + \frac{1}{n(\varepsilon )^{d/2+1}} \Big ], \end{aligned} \end{aligned}$$
(3.28)

which goes to 0 as \(\varepsilon \rightarrow 0^{+}\) by (1.14).

Finally, it remains to estimate \(\Gamma _3^\varepsilon \). Using (3.5), (1.11), and (3.22), we obtain

$$\begin{aligned} \begin{aligned} \left| \left\langle \Gamma _{3}^{\varepsilon },\phi \right\rangle \right|&\le \varepsilon \left| \sum _{j=1}^{d}\int \limits _{0}^{T}\int \limits _{[-M,M]^{d+1}} |u_{x_j}^{\varepsilon }(t,\mathbf{x})|^2 \phi (t,\mathbf{x},\lambda ) \, \mathrm{d}\mathbf{x}\mathrm{d}t \mathrm{d}\lambda \right| \\&\le \varepsilon M \Vert \phi \Vert _{\infty } \Vert \nabla u^{\varepsilon }\Vert ^2_{L^{2}([0,T]\times {\mathbb {R}}^{d})} \\&\le C \Vert \phi \Vert _{\infty } (1+\frac{\delta }{n(\varepsilon )^2}) \le C \Vert \phi \Vert _{\infty }. \end{aligned} \end{aligned}$$
(3.29)

From the estimates given above, we see that \(\mathrm {div}\Gamma _1^\varepsilon \) and \(\mathrm {div}\Gamma _2^\varepsilon \) converge strongly to zero in \(H^{-1}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\) implying that they converge strongly in \(W^{-1,r}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\), \(r\in [1,2)\), as well.

As for \(\Gamma _3^\varepsilon \) and \(\Gamma _4^\varepsilon \), they are locally bounded in the space of Radon measures (which we denote by \({{{\mathcal {M}}}}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\)) and thus precompact in \(W^{-1,r}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\), \(r\in \big [1,\frac{d+2}{d+1}\big )\) [18, Th. 1.3.2]. This implies condition (ii) from the Lemma. \(\square \)

According to the previous lemma, we can apply the velocity averaging result, Theorem 3.3. Indeed, we have:

Lemma 3.6

Any solution \(h_\varepsilon \) of (3.15) satisfies

$$\begin{aligned} \frac{1}{2}\partial _t h_\varepsilon +\mathrm {div}(h_\varepsilon \partial _\lambda {{\mathfrak {f}}}(\mathbf{x},\lambda ) )=\partial _\lambda \mathrm {div}(G_1^\varepsilon )+ \partial ^2_\lambda G^\varepsilon _2 + \mathrm {div}(G^\varepsilon _3)+\partial _\lambda (h_\varepsilon \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda )), \end{aligned}$$
(3.30)

where \((G_j^\varepsilon )\), \(j=1,3\), are strongly precompact in \(L_{loc}^r({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\) while \((G^\varepsilon _2)\) and \((h_\varepsilon \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ))\) are strongly precompact in \(W^{-1,r}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\), \(r\in \big [1,\frac{d+2}{d+1}\big )\).

Proof

It is enough to rewrite (3.15) in the form

$$\begin{aligned} \begin{aligned} \frac{1}{2}\partial _t h_\varepsilon +\mathrm {div}(h_\varepsilon \partial _\lambda {{\mathfrak {f}}}(\mathbf{x},\lambda ) )&=\partial _\lambda \mathrm {div}(G_1^\varepsilon )+ \partial ^2_\lambda G^\varepsilon _2 +\mathrm {div}(h_\varepsilon (\partial _\lambda {{\mathfrak {f}}}(\mathbf{x},\lambda )-\partial _\lambda {{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda )))\\&\quad +\,\partial _\lambda (h_\varepsilon \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda )), \end{aligned} \end{aligned}$$

and denote

$$\begin{aligned} G_3^\varepsilon =h_\varepsilon (\partial _\lambda {{\mathfrak {f}}}(\mathbf{x},\lambda )-\partial _\lambda {{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda )). \end{aligned}$$

Clearly, since \({{\mathfrak {f}}}_\varepsilon =K_\varepsilon \cdot {{\mathfrak {f}}}\star \omega _{n(\varepsilon )}\), we have \(G_3^\varepsilon \rightarrow 0\) as \(\varepsilon \rightarrow 0\) in \(L_{loc}^p({\mathbb {R}}^d\times {\mathbb {R}})\) (where p is given in (C1)). On the other hand, according to condition (C2), we see that \((h_\varepsilon \mathrm {div}{{\mathfrak {f}}}_\varepsilon (\mathbf{x},\lambda ))\) is bounded in the space of measures and thus strongly precompact in \( W^{-1,r}_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}})\). This concludes the proof. \(\square \)

Now, we are ready to use Theorem 3.3. In fact, we may choose \(r\in \big (1,\frac{d+2}{d+1}\big )\) and \(s\geqslant 2\) in such a way that

$$\begin{aligned} 1+\frac{1}{s} - \frac{1}{r} <1 \end{aligned}$$

and then set \({{\bar{p}}} := (1+\frac{1}{s} - \frac{1}{r})^{-1}\). Then, fixing \({\bar{p}}'\) such that \(\frac{1}{{{\bar{p}}}} + \frac{1}{{{\bar{p}}}'}=1\), the assumptions of Theorem 3.1 allow us to apply Theorem 3.3 (noting that \((h_\varepsilon )\) is bounded in any \(L^s_{loc}\), hence contains a weakly convergent subsequence). More precisely, one easily checks that any \({{\bar{p}}}\in \big (\frac{2d+4}{d+4},\infty \big )\), and thereby any \({{\bar{p}}}' \in \big (1,2+\frac{4}{d}\big )\), can be obtained in this way. This is the reason for the specific assumption in Theorem 3.1.
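The range of attainable \({{\bar{p}}}\) can be read off from the endpoint cases; a short computation:

$$\begin{aligned} s=2,\ \frac{1}{r}\searrow \frac{d+1}{d+2}:&\quad 1+\frac{1}{s}-\frac{1}{r}\nearrow \frac{3}{2}-\frac{d+1}{d+2}=\frac{d+4}{2(d+2)}, \ \text{ so } \ {{\bar{p}}}\searrow \frac{2d+4}{d+4};\\ s\rightarrow \infty ,\ r\rightarrow 1^{+}:&\quad 1+\frac{1}{s}-\frac{1}{r}\rightarrow 0^{+}, \ \text{ so } \ {{\bar{p}}}\rightarrow \infty . \end{aligned}$$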

Consequently, for any sequence \(\varepsilon _n\searrow 0\) and any \(\rho \in L^2({\mathbb {R}})\), setting \(h_n:=h_{\varepsilon _n}\) and \(u_n:=u_{\varepsilon _n}\), the sequence \(\left( \int \limits _{\mathbb {R}}\rho (\lambda ) h_n(t,\mathbf{x},\lambda ) \mathrm{d}\lambda \right) \) is strongly precompact in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\). As we shall see in the next theorem, this implies strong convergence of the sequence \((u_n)\). We first need the following auxiliary result.

Lemma 3.7

Assume that the sequence \((u_n)\) is bounded in \(L^p(\Omega )\), \(\Omega \Subset {\mathbb {R}}^d\), \(d\in {\mathbb {N}}\), for some \(p\ge 1\). Define

$$\begin{aligned} \Omega _n^l=\{ \mathbf{x}\in \Omega :\, |u_{n}(\mathbf{x})| > l \}. \end{aligned}$$

Then,

$$\begin{aligned} \lim \limits _{l\rightarrow \infty } \sup \limits _{n\in {\mathbb {N}}} {{\mathrm{meas}}}(\Omega _n^l) =0. \end{aligned}$$
(3.31)

Proof

Since \((u_{n})\) is bounded in \(L^p(\Omega )\) and thus in \(L^1(\Omega )\) as well (since \(\Omega \Subset {\mathbb {R}}^d\)), we have

$$\begin{aligned}&\sup \limits _{n\in {\mathbb {N}}} \int \limits _{\Omega }|u_{n}(\mathbf{x})| \mathrm{d}\mathbf{x}\ge \sup \limits _{n\in {\mathbb {N}}} \int \limits _{\Omega _n^l} l \mathrm{d}\mathbf{x}\, \implies \frac{1}{l}\sup \limits _{n\in {\mathbb {N}}} \int \limits _{\Omega }|u_{n}(\mathbf{x})| \mathrm{d}\mathbf{x}\ge \sup \limits _{n\in {\mathbb {N}}} \mathrm{meas}(\Omega _n^l), \end{aligned}$$

implying (3.31) after letting \(l\rightarrow \infty \) here. \(\square \)

With the above notations, we finally arrive at:

Theorem 3.8

Under the non-degeneracy conditions (3.1), the sequence \((u_n=u_{\varepsilon _n})\) of solutions to (1.7), (1.10) strongly converges along a subsequence in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\).

Proof

For \(l\in {\mathbb {N}}\) we set \(\rho (\lambda )=\chi _{(-l,l)}(\lambda )\), where \(\chi _{(-l,l)}\) is the characteristic function of the interval \((-l,l)\). Then, by Theorem 3.3 combined with a diagonalization argument there exists a common subsequence, again denoted by \((u_n)\), such that for any \(l\in {\mathbb {N}}\):

$$\begin{aligned} \frac{1}{2}\int \limits _{-l}^l h_n(t,\mathbf{x},\lambda ) \mathrm{d}\lambda&=u_n \chi _{\{|u_n|\le l\}}(t,\mathbf{x})+l \chi _{\{u_n>l\}}(t,\mathbf{x})-l \chi _{\{u_n<-l\}}(t,\mathbf{x})\nonumber \\&=:T_l(u_n) \rightarrow u^l \end{aligned}$$
(3.32)

as \(n\rightarrow \infty \) in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\). The operators \(T_l\) are known as truncation operators [17].
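The first equality in (3.32) can be verified case by case. As a sketch, assuming the kinetic function has the form \(h_n(t,\mathbf{x},\lambda )=\mathrm{sgn}(u_n(t,\mathbf{x})-\lambda )\) (the form consistent with (3.32)), we compute

$$\begin{aligned} \frac{1}{2}\int \limits _{-l}^{l}\mathrm{sgn}(u_n-\lambda )\, \mathrm{d}\lambda = {\left\{ \begin{array}{ll} \frac{1}{2}\big ((u_n+l)-(l-u_n)\big )=u_n, &{} |u_n|\le l,\\ \frac{1}{2}\cdot 2l=l, &{} u_n> l,\\ -l, &{} u_n<-l. \end{array}\right. } \end{aligned}$$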

From here, it is not difficult to conclude that \((u_n)\) converges. First, we show that the sequence \((u^l)\) converges strongly in \(L^1_{loc}({\mathbb {R}}^+\times {\mathbb {R}}^d)\) as \(l\rightarrow \infty \).

To this end, let \(\Omega \Subset {\mathbb {R}}^+\times {\mathbb {R}}^d\). We claim that

$$\begin{aligned} \lim \limits _{l\rightarrow \infty } \sup \limits _{n\in {\mathbb {N}}} \Vert {T_l(u_{n}) - u_{n}}\Vert _{L^1(\Omega )} = 0\,. \end{aligned}$$
(3.33)

In fact, let

$$\begin{aligned} \Omega _n^l=\{ (t,\mathbf{x})\in \Omega :\, |u_{n}(t,\mathbf{x})| > l \}. \end{aligned}$$

Then, since (by (3.4), (1.11), (1.14), and (3.22)) \((u_{n})\) is bounded in \(L^2(\Omega )\), we have

$$\begin{aligned} \int \limits _\Omega |u_{n}-T_l(u_{n})| \mathrm{d}t \mathrm{d}\mathbf{x}\le \int \limits _{\Omega _n^l} |u_{n}|\mathrm{d}t \mathrm{d}\mathbf{x}\le \mathrm{meas}(\Omega _n^l)^{1/2} \, \Vert u_n\Vert _{L^2(\Omega )}\rightarrow 0 \end{aligned}$$

as \(l\rightarrow \infty \), uniformly with respect to n according to (3.31). This gives (3.33).
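Quantitatively, combining the uniform \(L^2\) bound on \((u_n)\) with the Chebyshev-type estimate from the proof of Lemma 3.7 (applied with \(p=1\), which gives \(\sup _n \mathrm{meas}(\Omega _n^l)\le C/l\)), the convergence in (3.33) holds at least at the rate

$$\begin{aligned} \sup \limits _{n\in {\mathbb {N}}} \int \limits _\Omega |u_{n}-T_l(u_{n})| \mathrm{d}t \mathrm{d}\mathbf{x}\le \Big (\frac{C}{l}\Big )^{1/2} \sup \limits _{n\in {\mathbb {N}}} \Vert u_n\Vert _{L^2(\Omega )}\le \frac{C}{\sqrt{l}}. \end{aligned}$$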

Next, we estimate

$$\begin{aligned} \Vert u^{l_1}-u^{l_2}\Vert _{L^1( \Omega )}&\le \Vert u^{l_1}-T_{l_1}(u_n)\Vert _{L^1( \Omega )}+\Vert T_{l_1}(u_n)-u_n\Vert _{L^1( \Omega )}\\&\quad +\,\Vert T_{l_2}(u_n)-u_n\Vert _{L^1( \Omega )}+\Vert T_{l_2}(u_n)-u^{l_2}\Vert _{L^1( \Omega )}\,, \end{aligned}$$

which together with (3.32) and (3.33) implies that \((u^l)\) is a Cauchy sequence. Thus, there exists \(u\in L^1(\Omega )\) such that

$$\begin{aligned} u^l \rightarrow u \ \ \mathrm{in} \ \ L^1(\Omega ). \end{aligned}$$
(3.34)

Now it is not difficult to see that the entire sequence \((u_n)\) converges toward u in \(L^1(\Omega )\) as well. Namely,

$$\begin{aligned} \Vert u_n-u\Vert _{L^1(\Omega )} \le \Vert u_n-T_l(u_n)\Vert _{L^1(\Omega )}+\Vert T_l(u_n)-u^l\Vert _{L^1(\Omega )}+\Vert u^l-u\Vert _{L^1(\Omega )}, \end{aligned}$$

which by the definition of the functions \(u^l\), in conjunction with (3.33) and (3.34) gives the claim. \(\square \)

So far, we have proved only item (i) of Theorem 3.1. The other two items, (ii) and (iii), can be proven by adapting the proof of item (i). We provide a more precise explanation in the following remark.

Remark 3.9

Derivation of items (ii) and (iii) of Theorem 3.1.

If we additionally assume that \(\Vert \partial _\lambda {{\mathfrak {f}}}\Vert \le C <\infty \), then we can use (3.8) in the case \(\delta ={{{\mathcal {O}}}}(\varepsilon ^2)\) and \(\frac{\sqrt{\delta }}{\sqrt{\varepsilon } n(\varepsilon )^{d/2+2}}\rightarrow 0\), and the considerations thereafter remain the same. This gives (ii). Indeed, we have to check whether the sequence of equations (3.30) satisfies the conditions of Theorem 3.3. To this end, we need to estimate \(\Gamma ^\varepsilon _j\), \(j=1,\ldots ,4\), appearing in (3.20). If we assume \(\Vert \partial _\lambda {{\mathfrak {f}}}\Vert \le C <\infty \), then we can omit (3.25) and get, instead of (3.27),

$$\begin{aligned} \left| \left\langle \mathrm {div}\Gamma _{2}^{\varepsilon },\phi \right\rangle \right|&\le C \delta ^{1/2} \left( \frac{1}{\sqrt{\varepsilon } } + \Big (\frac{\sqrt{\delta }}{\sqrt{\varepsilon }}+\sqrt{\varepsilon }\Big )\frac{1}{n(\varepsilon )} + \frac{1}{\sqrt{\varepsilon }} +\frac{1}{n(\varepsilon )^{d/2 +1}} \right) \\&\quad \cdot \Vert \phi \Vert _{H_{0}^{1}([0,T]\times {\mathbb {R}}^{d+1})} \rightarrow 0 \quad \text{ as } \varepsilon \rightarrow 0^{+}. \nonumber \end{aligned}$$
(3.35)

Similarly, instead of (3.28), we have

$$\begin{aligned} \begin{aligned} \left| \left\langle \Gamma _{4}^{\varepsilon },\phi \right\rangle \right|&\le \delta \left| \sum _{j=1}^{d}\int \limits _{0}^{T} \int \limits _{[-M,M]^{d+1}}|u_{x_j}^{\varepsilon }| \, |u_{t,x_{j}}^{\varepsilon }| \, |\phi (t,\mathbf{x},\lambda )|\mathrm{d} \mathbf{x}\mathrm{d}t \mathrm{d}\lambda \right| \\&\leqslant C \sqrt{\delta } \Vert \phi \Vert _\infty \Big (\frac{1}{\sqrt{\varepsilon }} + \frac{\sqrt{\delta }}{\sqrt{\varepsilon } n(\varepsilon )}\Big ) \Big [\frac{1}{\sqrt{\varepsilon }}+ \\&\quad +\,\Big (\frac{\sqrt{\delta }}{\sqrt{\varepsilon }} +\sqrt{\varepsilon }\Big )\frac{1}{n(\varepsilon )} + \frac{1}{\sqrt{\varepsilon }}+ \frac{1}{n(\varepsilon )^{d/2 +1}} \Big ]\le C \Vert \phi \Vert _\infty . \end{aligned} \end{aligned}$$
(3.36)

If we have a regular flux, i.e., \({{\mathfrak {f}}}\in C^1({\mathbb {R}}^d\times {\mathbb {R}})\), and the conditions \(\delta =o(\varepsilon ^2)\) and \(\frac{\sqrt{\delta }}{\sqrt{\varepsilon } n(\varepsilon )^{d/2+2}}\rightarrow 0\) as \(\varepsilon \rightarrow 0\) hold, we can use Young measures in the way given in [31] to derive the convergence. Actually, in this case the diffusion given by the second-order term in (1.7) will dominate over the dynamic capillarity given by the third-order term in (1.7), and therefore, we will end up with the unique Kruzhkov admissible solution to (1.6) with the corresponding initial data.

Indeed, it is not difficult to see that under the regularity assumptions on \({{\mathfrak {f}}}\) and \(\delta =o(\varepsilon ^2)\), we have

$$\begin{aligned} \begin{aligned}&-\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}} \left( \Gamma _{1}^{\varepsilon }+\Gamma _{2}^{\varepsilon }\right) \cdot \nabla \phi (t,\mathbf{x}) \eta ''(\lambda )\mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \rightarrow 0\\&-\int \limits _{{\mathbb {R}}^+\times {\mathbb {R}}^d \times {\mathbb {R}}} \Gamma _{4}^{\varepsilon }(t,\mathbf{x},\lambda ) \phi (t,\mathbf{x}) \eta '''(\lambda ) \mathrm{d}t \mathrm{d}\mathbf{x}\mathrm{d}\lambda \rightarrow 0. \end{aligned} \end{aligned}$$
(3.37)

Now, denote by \(\nu _{(t,\mathbf{x})}\) the Young measure (e.g., [18]) corresponding to a subsequence of the family \((u^\varepsilon )\). If we let \(\varepsilon \rightarrow 0\) in (3.18) along the subsequence for a convex entropy \(\eta \), and take (3.37) into account, we get in the sense of distributions (below, \(q(\mathbf{x},\lambda )=\int \limits _{-\infty }^\lambda \eta '(v) \partial _v {{\mathfrak {f}}}(\mathbf{x},v) \mathrm{d}v\) is the entropy flux)

$$\begin{aligned} \partial _{t} \int \eta (\lambda )\mathrm{d}\nu _{(t,\mathbf{x})}(\lambda ) +\mathrm {div}\int q(\mathbf{x},\lambda ) \mathrm{d}\nu _{(t,\mathbf{x})}(\lambda )+\int \int \limits _{-\infty }^{\lambda } \eta ''(v)\mathrm {div}{{\mathfrak {f}}}(\mathbf{x},v) \mathrm{d}v \mathrm{d}\nu _{(t,\mathbf{x})}(\lambda ) \le 0. \end{aligned}$$

Now, we simply rely on the result from [15] (see also [32, 49]) to conclude that the Young measure is unique and atomic, i.e., of the form \(\nu _{(t,\mathbf{x})}(\lambda )=\delta (\lambda -u(t,\mathbf{x}))\) for the unique entropy admissible solution to the underlying conservation law (1.6).