1 Introduction

We consider the quasi-variational inequality (QVI)

$$\begin{aligned} \text {Find } y \in Q(y) \quad \text {such that}\quad \langle A(y) - f ,v - y\rangle \ge 0 \qquad \forall v \in Q(y) . \end{aligned}$$
(1.1)

Here, V is a Hilbert space, \(A :V \rightarrow V^\star \) is a (possibly nonlinear) mapping, and \(f \in V^\star \). We will not cover the general situation of a set-valued mapping \(Q :V \rightrightarrows V\), but we restrict the treatment of (1.1) to the case in which Q(y) is a moving set, i.e.,

$$\begin{aligned} Q(y) = K + \varPhi (y) \end{aligned}$$
(1.2)

for some non-empty, closed and convex subset \(K \subset V\) and \(\varPhi :V \rightarrow V\). It is well known that QVIs have many important real-world applications; we refer, for example, to [1, 3, 5, 16] and the references therein.

The main contributions of this paper are the following.

  • We prove existence and uniqueness of solutions to (1.1) under a smallness assumption on the mapping \(\varPhi \), see Sect. 3.

  • If, additionally, the functions A and \(\varPhi \) are differentiable and if K is polyhedric, we establish the directional differentiability of the solution mapping of (1.1), see Sect. 5.

  • For the associated optimal control problem, we derive necessary optimality conditions of strongly stationary type, see Sect. 6.

In particular, our results are applicable if the Lipschitz constant of \(\varPhi \) is small. Let us put our work in perspective. In the following discussion, we will assume that A is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz and that \(\varPhi \) is \(L_\varPhi \)-Lipschitz. We refer to Sect. 2 for the definitions. We further define the condition number of A via \(\gamma _A := L_A / \mu _A \ge 1\). An existence and uniqueness result for the general QVI (1.1) was given in [15, Theorem 9]. This result can be applied to the moving set case (1.2) via [14, Lemma 3.2]. One obtains the unique solvability of (1.1) under the condition

$$\begin{aligned} L_\varPhi < 1 - \sqrt{1 - 1/\gamma _A^2} = \frac{1}{\gamma _A \, \Big (\gamma _A + \sqrt{\gamma _A^2-1}\Big )}. \end{aligned}$$
(1.3)

In the work [14, Corollary 2] the requirement was relaxed to

$$\begin{aligned} L_\varPhi < \frac{1}{\gamma _A} . \end{aligned}$$
(1.4)

In this work, we shall show that

$$\begin{aligned} L_\varPhi < \frac{2 \, \sqrt{\gamma _A}}{1 + \gamma _A} \end{aligned}$$
(1.5)

is sufficient for existence and uniqueness under the condition that A is the derivative of a convex function. Note that A is indeed a derivative of a convex function in many important applications. Moreover, the conditions (1.4) and (1.5) are necessary for uniqueness in the following sense: Whenever the constants \(L_\varPhi < 1\), \(0 < \mu _A \le L_A\) violate (1.4) with \(\gamma _A := L_A / \mu _A\), there exist bounded and linear operators \(A :V \rightarrow V^\star \) and \(\varPhi :V \rightarrow V\) possessing these constants such that (1.1) does not have a unique solution for every \(f \in V^\star \). If even (1.5) is violated, A can be chosen to be symmetric. We refer to Theorems 3.6 and 3.7 below for the precise formulation of this result.
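To get a feeling for how far apart the three smallness conditions are, the right-hand sides of (1.3), (1.4) and (1.5) can be tabulated for a few condition numbers. The following is a minimal sketch; the function names and the sample values of \(\gamma _A\) are chosen here for illustration only.

```python
import math

def bound_1_3(gamma):
    """Right-hand side of (1.3): 1 - sqrt(1 - 1/gamma^2)."""
    return 1.0 - math.sqrt(1.0 - 1.0 / gamma**2)

def bound_1_4(gamma):
    """Right-hand side of (1.4): 1/gamma."""
    return 1.0 / gamma

def bound_1_5(gamma):
    """Right-hand side of (1.5): 2*sqrt(gamma)/(1 + gamma)."""
    return 2.0 * math.sqrt(gamma) / (1.0 + gamma)

# Illustrative condition numbers gamma_A = L_A / mu_A >= 1.
for gamma in (1.0, 2.0, 10.0, 100.0):
    print(f"gamma_A = {gamma:6.1f}: (1.3) {bound_1_3(gamma):.4f}, "
          f"(1.4) {bound_1_4(gamma):.4f}, (1.5) {bound_1_5(gamma):.4f}")
```

For \(\gamma _A = 1\) all three bounds equal 1; for larger condition numbers the bound (1.5) decays like \(\gamma _A^{-1/2}\), whereas (1.4) and (1.3) decay like \(\gamma _A^{-1}\) and \(\gamma _A^{-2}\), respectively.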

For a different approach to obtain uniqueness of solutions to (1.1), we refer to [10].

To our knowledge, [1] is the only contribution concerning differentiability of the solution mapping of (1.1). Their approach is based on monotonicity considerations and only the differentiability in non-negative directions is obtained. In what follows, we are able to relax the assumption required for the differentiability and we also obtain differentiability in all directions, see Theorem 5.5.

Finally, we are not aware of any contribution in which stationarity conditions for the optimal control of (1.1) are obtained.

2 Notation and preliminaries

Throughout this work, V will denote a Hilbert space. Its dual space is denoted by \(V^\star \). The radial cone, the tangent cone and the normal cone of a closed, convex set \(K \subset V\) at \(y \in K\) are given by

$$\begin{aligned} \mathcal {R}_K(y)&:= {\text {cone}}(K - y) = \bigcup _{\alpha > 0} \alpha \, (K-y) , \qquad \mathcal {T}_K(y) := {\text {cl}}\{\mathcal {R}_K(y)\},\\ \mathcal {T}_K(y)^\circ&:= \big \{\lambda \in V^\star \big |\langle \lambda ,v - y\rangle \le 0 \; \forall v \in K\big \}, \end{aligned}$$

respectively. The critical cone of K w.r.t. \((y,\lambda ) \in K \times \mathcal {T}_K(y)^\circ \) is given by

$$\begin{aligned} \mathcal {K}_K(y, \lambda ) := \mathcal {T}_K(y) \cap \lambda ^\perp = \big \{v \in \mathcal {T}_K(y) \big |\langle \lambda ,v\rangle = 0 \big \} . \end{aligned}$$

The set K is called polyhedric at \((y,\lambda )\) if \(\mathcal {K}_K(y,\lambda ) = {\text {cl}}\{\mathcal {R}_K(y) \cap \lambda ^\perp \}\). We refer to [17] for a recent review of polyhedricity.

A mapping \(B:V \rightarrow V^\star \) is called monotone if

$$\begin{aligned} \langle B(y_1) - B(y_2),y_1 - y_2\rangle \ge 0 \qquad \forall y_1,y_2 \in V. \end{aligned}$$

It is called \(\mu \)-strongly monotone if \(\mu > 0\) satisfies

$$\begin{aligned} \langle B(y_1) - B(y_2),y_1 - y_2\rangle \ge \mu \, \Vert y_1 - y_2\Vert _V^2 \qquad \forall y_1,y_2 \in V. \end{aligned}$$

If H is another Hilbert space, a mapping \(C:V \rightarrow H\) is called L-Lipschitz for some \(L \ge 0\) if

$$\begin{aligned} \Vert C(y_1) - C(y_2)\Vert _H \le L \, \Vert y_1 - y_2\Vert _V \qquad \forall y_1,y_2 \in V. \end{aligned}$$

The monotonicity of an operator implies some weak lower semicontinuity.

Lemma 2.1

Let \(A :V \rightarrow V^\star \) be a monotone operator. Suppose that \(y_n \rightharpoonup y\) in V and \(A(y_n) \rightharpoonup A(y)\) in \(V^\star \). Then,

$$\begin{aligned} \liminf _{n \rightarrow \infty } \langle A(y_n),y_n\rangle \ge \langle A(y),y\rangle . \end{aligned}$$

Proof

From the monotonicity of A we find

$$\begin{aligned} \langle A(y_n),y_n\rangle \ge \langle A(y_n),y \rangle + \langle A(y ),y_n\rangle - \langle A(y ),y\rangle . \end{aligned}$$

The right-hand side converges towards \(\langle A(y),y\rangle \) due to the weak convergences \(y_n \rightharpoonup y\) and \(A(y_n) \rightharpoonup A(y)\). This implies the claim. \(\square \)

In the case that A is additionally bounded and linear, the above claim can be obtained from the observation that \(y \mapsto \langle A(y),y\rangle \) is convex. This convexity does not hold in the nonlinear setting: consider \(A :\mathbb {R}\rightarrow \mathbb {R}\), \(y \mapsto \max (-\,1,\min (1,y))\).
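The failure of convexity for the clipping map can be seen from a single midpoint test. The sketch below evaluates \(q(y) = \langle A(y),y\rangle = A(y)\,y\) at two sample points, which are chosen here for illustration, and at their midpoint.

```python
def A(y):
    """Monotone, nonlinear map from the text: clip y to [-1, 1]."""
    return max(-1.0, min(1.0, y))

def q(y):
    """The function y -> <A(y), y> on the real line."""
    return A(y) * y

# Midpoint convexity would require q((a + b)/2) <= (q(a) + q(b))/2.
a, b = 0.5, 2.0
midpoint_value = q(0.5 * (a + b))      # q(1.25) = 1.25
average_value = 0.5 * (q(a) + q(b))    # (0.25 + 2.0)/2 = 1.125
print(midpoint_value, average_value)
```

Since the value at the midpoint exceeds the average of the endpoint values, q is not convex, although A is monotone.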

In order to obtain unique solvability of (1.1) via contraction-type arguments, one typically requires an inequality like

$$\begin{aligned} \Vert {\text {Proj}}_{Q(x)}(z) - {\text {Proj}}_{Q(y)}(z) \Vert _V \le L_Q \, \Vert x - y\Vert _V \qquad \forall x,y,z \in V \end{aligned}$$
(2.1)

for some \(L_Q \ge 0\), see, e.g., [14, Theorem 4.1]. Note that this inequality is not related to the Lipschitz continuity of the projection since the arguments of the projections in (2.1) coincide. By means of an example, we show that (2.1) does not hold for obstacle-type problems if Q is not of the moving-set type. We consider the setting \(\varOmega = (0,1)\), \(V = H_0^1(\varOmega )\), \(f = 1\), \(A = -\,\varDelta \) and

$$\begin{aligned} K(h) := \{ v \in V \big |v \le h \} \end{aligned}$$

for \(\mathbb {R}\ni h \ge 0\). It is easy to check that the projection of \(A^{-1} f\) onto the set K(h) is given by

$$\begin{aligned} y_h(x) = {\left\{ \begin{array}{ll} t_h \, x - \frac{1}{2} \, x^2 &{} \text {for } x \le t_h \\ h &{} \text {for } t_h< x < 1-t_h \\ t_h \, (1-x) - \frac{1}{2} \, (1-x)^2 &{} \text {for } x \ge 1-t_h \end{array}\right. } \end{aligned}$$

with \(t_h := \sqrt{2\,h}\) for all \(h \in [0,1/8]\). Here, we used the norm \(\Vert v\Vert _{H_0^1}^2 = \int _\varOmega |\nabla v|^2\,{\mathrm {d}}x\). Then, \(\Vert y_h\Vert _{H_0^1(\varOmega )} = C \, h^{3/4}\) for some \(C > 0\). Since \(y_0 = 0\), the mapping \(h \mapsto y_h\) is not Lipschitz at \(h = 0\). By choosing a suitable \(\varPsi :V \rightarrow \mathbb {R}\) it can be checked that \(Q = K \circ \varPsi \) violates (2.1).
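The scaling \(\Vert y_h\Vert _{H_0^1(\varOmega )} = C \, h^{3/4}\) can be checked numerically. The sketch below assumes that \(y_h\) is symmetric about \(x = 1/2\), so that \(y_h'(x) = t_h - x\) on \((0,t_h)\), \(y_h' = 0\) on the contact set and \(y_h'(x) = (1-x) - t_h\) on \((1-t_h,1)\); under this assumption one computes \(C = \sqrt{2/3}\,2^{3/4}\). The quadrature rule and the grid size are illustrative choices.

```python
import math

def dy_h(x, h):
    """Derivative of y_h on (0, 1) for h in [0, 1/8], assuming a symmetric profile."""
    t = math.sqrt(2.0 * h)
    if x <= t:
        return t - x
    if x >= 1.0 - t:
        return (1.0 - x) - t
    return 0.0

def h01_norm(h, n=20_000):
    """Midpoint-rule approximation of the H_0^1 norm of y_h."""
    dx = 1.0 / n
    total = sum(dy_h((i + 0.5) * dx, h) ** 2 for i in range(n))
    return math.sqrt(total * dx)

for h in (1e-2, 1e-3, 1e-4):
    exact = math.sqrt(2.0 / 3.0) * (2.0 * h) ** 0.75
    # The ratio norm / h grows like h^(-1/4), so h -> y_h is not Lipschitz at 0.
    print(h, h01_norm(h), exact, h01_norm(h) / h)
```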

3 Moving-set QVIs as VIs

In this section, we utilize the moving-set structure of Q(y) to recast the QVI (1.1) as an equivalent variational inequality (VI). This is a classical approach, see also [1, Section 2], [2, Section 5.1]. We start by defining the new solution variable \(z := y - \varPhi (y) \in K\). In order to not lose any information, we require that the function \(I - \varPhi :V \rightarrow V\) is a bijection. Hence, \(y = (I - \varPhi )^{-1}(z)\). Now, it is easy to check that (1.1) is equivalent to

$$\begin{aligned} \text {Find } z \in K \quad \text {such that}\quad \big \langle (A \circ (I-\varPhi )^{-1}) (z) - f ,v - z\big \rangle \ge 0 \qquad \forall v \in K \end{aligned}$$
(3.1)

and \(y = (I - \varPhi )^{-1}(z)\). This means the following: Under the assumption that \(I-\varPhi \) is a bijection, y is a solution of (1.1) if and only if \(z = (I - \varPhi )(y)\) is a solution of (3.1). In what follows, we are going to use the VI (3.1) in order to obtain information about the QVI (1.1). In the case in which both A and \(\varPhi \) are linear, such a strategy was suggested in [1, Remark 7]. We shall see that this is also viable in the fully nonlinear case.
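The reformulation also suggests a simple way to compute solutions. The following one-dimensional sketch takes \(V = \mathbb {R}\) and \(K = (-\infty ,\psi ]\), evaluates \((I-\varPhi )^{-1}\) by a fixed-point iteration and applies a projected iteration to (3.1). The concrete choices of A, \(\varPhi \), f, \(\psi \) and the step size `rho` are hypothetical and only serve as an illustration; this is not a method taken from the text.

```python
import math

def solve_inverse(z, phi, tol=1e-14, max_iter=10_000):
    """Compute y = (I - phi)^{-1}(z) by the fixed-point iteration y <- z + phi(y)."""
    y = z
    for _ in range(max_iter):
        y_new = z + phi(y)
        if abs(y_new - y) <= tol:
            return y_new
        y = y_new
    return y

def solve_qvi_1d(A, phi, f, psi, rho, tol=1e-12, max_iter=100_000):
    """Solve the VI (3.1) on K = (-inf, psi] by a projected iteration, then recover y."""
    z = min(psi, 0.0)
    for _ in range(max_iter):
        y = solve_inverse(z, phi)
        z_new = min(psi, z - rho * (A(y) - f))   # projection onto K
        if abs(z_new - z) <= tol:
            z = z_new
            break
        z = z_new
    return solve_inverse(z, phi), z

A = lambda y: 2.0 * y + math.sin(y)      # mu_A = 1, L_A = 3 (hypothetical choice)
phi = lambda y: 0.25 * math.cos(y)       # L_phi = 0.25 < 1/gamma_A = 1/3
y, z = solve_qvi_1d(A, phi, f=5.0, psi=1.0, rho=0.2)
print(y, z)
```

In this example the constraint is active, i.e., the iteration returns \(z = \psi \), and the recovered y satisfies \(y - \varPhi (y) = \psi \) and \(A(y) \le f\).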

In order to analyze (3.1) we will frequently make use of the following assumption.

Assumption 3.1

The operator \(I - \varPhi :V \rightarrow V\) is a bijection and the operator \(B := A \circ (I-\varPhi )^{-1}\) is strongly monotone and Lipschitz continuous.

Using the equivalence of (1.1) and (3.1) as well as the existence result [11, Corollary III.1.8] for (3.1), we obtain the following existence and uniqueness result for (1.1).

Theorem 3.2

Under Assumption 3.1, the QVI (1.1) has a unique solution \(y \in V\) for any \(f \in V^\star \). Moreover, the mapping \(f \mapsto y\) is Lipschitz continuous.

In the remainder of this section, we give some conditions implying Assumption 3.1.

Lemma 3.3

We assume that A is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz and that \(\varPhi \) is \(L_\varPhi \)-Lipschitz. We further assume that

$$\begin{aligned} L_\varPhi < \frac{1}{\gamma _A}, \end{aligned}$$
(3.2)

where \(\gamma _A = L_A / \mu _A\). Then, the operator \(B := A \circ (I - \varPhi )^{-1}\) is \(\mu _B\)-strongly monotone and \(L_B\)-Lipschitz with

$$\begin{aligned} \mu _B = \frac{\mu _A - L_A \, L_\varPhi }{(1 + L_\varPhi )^2} \qquad \text {and}\qquad L_B = \frac{L_A}{1 - L_\varPhi } . \end{aligned}$$

In particular, Assumption 3.1 is satisfied and (1.1) has a unique solution for every \(f \in V^\star \).

Proof

First we remark that we have \(L_\varPhi < \gamma _A^{-1} \le 1\). Thus, Banach’s fixed point theorem implies that \(I-\varPhi \) is a bijection. We claim that \((1-L_\varPhi )^{-1}\) is a Lipschitz constant of \((I-\varPhi )^{-1}\). Indeed, let \(y_1, y_2 \in V\) be arbitrary. We define \(x_i := (I-\varPhi )^{-1}(y_i)\) for \(i = 1,2\). Then, \(x_i - y_i = \varPhi (x_i)\), \(i = 1,2\) and this yields

$$\begin{aligned} \Vert x_1 - x_2\Vert _V - \Vert y_1 - y_2\Vert _V \le \Vert (x_1 - y_1) - (x_2 - y_2) \Vert _V \le L_\varPhi \, \Vert x_1 - x_2\Vert _V . \end{aligned}$$

This shows the claim concerning a Lipschitz constant of \((I-\varPhi )^{-1}\). Moreover, this directly shows that \(L_B\) is a Lipschitz constant of B.

For arbitrary \(y_1, y_2 \in V\) we again use \(x_i := (I-\varPhi )^{-1}(y_i)\) for \(i = 1,2\). Then,

$$\begin{aligned} \big \langle B(y_1) - B(y_2),y_1 - y_2\big \rangle&= \big \langle A (x_1) - A (x_2),(I-\varPhi )(x_1) - (I-\varPhi )(x_2)\big \rangle \\&\ge (\mu _A - L_A \, L_\varPhi ) \, \Vert x_1 - x_2\Vert _V^2 . \end{aligned}$$

The estimate

$$\begin{aligned} \Vert y_1 - y_2 \Vert _V = \big \Vert x_1 - x_2 - \big (\varPhi (x_1) - \varPhi (x_2)\big ) \big \Vert _V \le (1 + L_\varPhi ) \, \Vert x_1 - x_2\Vert _V \end{aligned}$$

yields the assertion concerning the strong monotonicity of B. The final claim follows from Theorem 3.2. \(\square \)

We recall that the condition (3.2) was used in [14, Corollary 2] to obtain existence and uniqueness for solutions of (3.1). The above analysis shows that this condition even implies Assumption 3.1.

Next, we show that the estimate (3.2) can be significantly relaxed if A is the derivative of a convex function. To this end, we need to recall an important inequality for convex functions. This inequality is well-known in the finite-dimensional case, see, e.g., [13, Theorem 2.1.12] or [6, Lemma 3.10], and the proof carries over to arbitrary Hilbert spaces. We are, however, not aware of a reference in the infinite-dimensional case.

Lemma 3.4

Let \(g:V \rightarrow \mathbb {R}\) be a Fréchet differentiable convex function such that the derivative \(g' :V \rightarrow V^\star \) is \(\mu _g\)-strongly monotone and \(L_g\)-Lipschitz. Then,

$$\begin{aligned} \langle g'(x_1) - g'(x_2),x_1 - x_2\rangle \ge \frac{\mu _g \, L_g}{\mu _g + L_g} \, \Vert x_1 - x_2\Vert _V^2 + \frac{1}{\mu _g + L_g} \, \Vert g'(x_1) - g'(x_2)\Vert _{V^\star }^2 \end{aligned}$$

holds for all \(x_1, x_2 \in V\).

Proof

One can transfer the proofs of [13, Theorem 2.1.12] or [6, Lemma 3.10] to the infinite-dimensional case by using [4, Theorem 18.15]. \(\square \)

Lemma 3.5

We assume that A is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz and that \(\varPhi \) is \(L_\varPhi \)-Lipschitz. We further assume that there exists a Fréchet differentiable convex function \(g:V \rightarrow \mathbb {R}\) such that \(A = g'\) and

$$\begin{aligned} L_\varPhi < \frac{2 \, \sqrt{\gamma _A}}{1 + \gamma _A} = \frac{2 \, \sqrt{\mu _A \, L_A}}{\mu _A + L_A} \end{aligned}$$
(3.3)

where \(\gamma _A = L_A / \mu _A\). Then, the operator \(B := A \circ (I - \varPhi )^{-1}\) is \(\mu _B\)-strongly monotone and \(L_B\)-Lipschitz with

$$\begin{aligned} \mu _B = \frac{4 \, \mu _A \, L_A - L_\varPhi ^2 \, (\mu _A + L_A)^2}{4 \, (\mu _A + L_A) \, (1+L_\varPhi )^2} \qquad \text {and}\qquad L_B = \frac{L_A}{1 - L_\varPhi } . \end{aligned}$$

In particular, Assumption 3.1 is satisfied and (1.1) has a unique solution for every \(f \in V^\star \).

Proof

By arguing as in the proof of Lemma 3.3, we obtain that \(I-\varPhi \) is invertible and the value of the Lipschitz constant \(L_B\) follows.

Now, let \(y_1, y_2 \in V\) be arbitrary and we set \(x_i := (I-\varPhi )^{-1}(y_i)\), \(i =1,2\). Then, we apply Lemma 3.4 to obtain

$$\begin{aligned} \big \langle B(y_1) - B(y_2),y_1 - y_2\big \rangle&= \big \langle A (x_1) - A (x_2),(I-\varPhi )(x_1) - (I-\varPhi )(x_2)\big \rangle \\&\ge \frac{\mu _A \, L_A}{\mu _A + L_A} \, \Vert x_1 - x_2\Vert _V^2 \\&\qquad + \frac{1}{\mu _A + L_A} \, \Vert A(x_1) - A(x_2)\Vert _{V^\star }^2 \\&\qquad - L_\varPhi \, \Vert x_1 - x_2\Vert _V \, \Vert A(x_1) - A(x_2)\Vert _{V^\star } . \end{aligned}$$

Next, we employ Young’s inequality

$$\begin{aligned} L_\varPhi \, \Vert x_1 - x_2\Vert _V \, \Vert A(x_1) - A(x_2)\Vert _{V^\star }&\le \frac{L_\varPhi ^2 \, (\mu _A + L_A)}{4} \, \Vert x_1 - x_2\Vert _V^2 \\&\qquad + \frac{1}{\mu _A + L_A} \, \Vert A(x_1) - A(x_2)\Vert _{V^\star }^2 \end{aligned}$$

and get

$$\begin{aligned} \big \langle B(y_1) - B(y_2),y_1 - y_2\big \rangle&\ge \Big (\frac{\mu _A \, L_A}{\mu _A + L_A} - \frac{L_\varPhi ^2 \, (\mu _A + L_A)}{4}\Big ) \, \Vert x_1 - x_2\Vert _V^2 \\&= \frac{4 \, \mu _A \, L_A - L_\varPhi ^2 \, (\mu _A + L_A)^2}{4 \, (\mu _A + L_A)} \, \Vert x_1 - x_2\Vert _V^2 . \end{aligned}$$

From the proof of Lemma 3.3 we find \(\Vert y_1 - y_2 \Vert _V \le (1 + L_\varPhi ) \, \Vert x_1 - x_2\Vert _V\) and this yields the strong monotonicity of B. The final claim follows from Theorem 3.2. \(\square \)

Note that the inequality (3.3) is weaker than (3.2), unless \(\gamma _A = 1\). Lemma 3.5 is an improvement of the corresponding results in the literature, e.g., [14, Corollary 2], in the case that A is the derivative of a convex function. It is well known that A is a derivative of a convex function if and only if A is maximally cyclically monotone, see, e.g., [4, Theorem 22.14].
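The constants from Lemmas 3.3 and 3.5 are easy to evaluate. The following sketch compares the two lower bounds for \(\mu _B\) for a sample pair \(\mu _A = 1\), \(L_A = 4\), which is a hypothetical choice; the function names are introduced here for illustration.

```python
def mu_B_lemma_3_3(mu_A, L_A, L_phi):
    """Strong monotonicity constant of B from Lemma 3.3."""
    return (mu_A - L_A * L_phi) / (1.0 + L_phi) ** 2

def mu_B_lemma_3_5(mu_A, L_A, L_phi):
    """Strong monotonicity constant of B from Lemma 3.5 (A has a convex potential)."""
    s = mu_A + L_A
    return (4.0 * mu_A * L_A - L_phi**2 * s**2) / (4.0 * s * (1.0 + L_phi) ** 2)

def L_B(L_A, L_phi):
    """Lipschitz constant of B, identical in both lemmas."""
    return L_A / (1.0 - L_phi)

# gamma_A = 4: (3.2) requires L_phi < 0.25, whereas (3.3) requires L_phi < 0.8.
mu_A, L_A = 1.0, 4.0
for L_phi in (0.1, 0.25, 0.5, 0.79):
    print(L_phi, mu_B_lemma_3_3(mu_A, L_A, L_phi),
          mu_B_lemma_3_5(mu_A, L_A, L_phi), L_B(L_A, L_phi))
```

For \(L_\varPhi \) between the two thresholds, only the constant from Lemma 3.5 is positive. Note that for small \(L_\varPhi \) the constant from Lemma 3.3 can be the larger one, so neither formula dominates the other on the whole range.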

Finally, we demonstrate by means of two examples that the assumptions (3.2) and (3.3) are sharp, even in the case of linear operators. These examples are found by constructing operators for which the estimates in the proofs of Lemmas 3.3 and 3.5 are sharp. First, we validate the sharpness of (3.2).

Theorem 3.6

Let the constants \(0< \mu _A < L_A\) be given. We define \(L_\varPhi := \mu _A / L_A < 1\), i.e., (3.2) is violated. Then, there exist linear operators A and \(\varPhi \) on \(V = \mathbb {R}^2\) (equipped with the Euclidean inner product), such that A is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz, \(\varPhi \) is \(L_\varPhi \)-Lipschitz and \(A \, (I-\varPhi )^{-1}\) is not coercive. Moreover, there exists a one-dimensional subspace \(K \subset \mathbb {R}^2\) such that (3.1) and (1.1) are not uniquely solvable for all \(f \in V^\star \).

Proof

We define the constant \(c_A := \sqrt{L_A^2 - \mu _A^2} > 0\) and the operators

$$\begin{aligned} A := \begin{pmatrix} \mu _A &{} -c_A \\ c_A &{} \mu _A \end{pmatrix} , \qquad \varPhi := \frac{L_\varPhi }{L_A} \, y \, x^\top \end{aligned}$$

where

$$\begin{aligned} x = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad y = \begin{pmatrix} \mu _A \\ c_A \end{pmatrix} = A \, x . \end{aligned}$$

Since A is the combination of a rotation and a scaling by \(L_A\), it is easy to check that \(z^\top A \, z = \mu _A \, \Vert z\Vert ^2\) and \(\Vert A \, z\Vert = L_A \, \Vert z\Vert \) hold for all \(z \in \mathbb {R}^2\). Moreover, the Lipschitz constant of \(\varPhi \) is \(L_\varPhi \). However,

$$\begin{aligned} z^\top A \, (I - \varPhi )^{-1} \, z = 0, \qquad \text {where}\qquad z = (I-\varPhi ) \, x \ne 0 . \end{aligned}$$

Hence, \(A \, (I-\varPhi )^{-1}\) is not coercive. Moreover, if we set \(K = {\text {span}}\{z\}\) it is clear that (3.1) is not uniquely solvable for all \(f \in V^\star = \mathbb {R}^2\). Since \(I - \varPhi \) is a bijection, this implies that (1.1) is not uniquely solvable for all \(f \in V^\star = \mathbb {R}^2\). \(\square \)
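The construction in this proof can be verified numerically. The sketch below uses plain Python lists for vectors in \(\mathbb {R}^2\); the helper names and the sample values \(\mu _A = 1\), \(L_A = 3\) are chosen for illustration.

```python
import math

def matvec(M, v):
    """Product of a 2x2 matrix (nested lists) with a vector in R^2."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def theorem_3_6_check(mu_A, L_A):
    L_phi = mu_A / L_A
    c_A = math.sqrt(L_A**2 - mu_A**2)
    A = [[mu_A, -c_A], [c_A, mu_A]]
    x = [1.0, 0.0]
    y = matvec(A, x)                                      # y = A x
    # Phi = (L_phi / L_A) * y x^T is rank-one, so its norm is (L_phi / L_A) * |y| * |x|.
    phi_norm = (L_phi / L_A) * math.hypot(*y) * math.hypot(*x)
    # Phi x = (L_phi / L_A) * y because x^T x = 1.
    z = [x[i] - (L_phi / L_A) * y[i] for i in range(2)]   # z = (I - Phi) x
    # (I - Phi)^{-1} z = x, hence z^T A (I - Phi)^{-1} z = z^T A x.
    return dot(z, matvec(A, x)), phi_norm, math.hypot(*z)

value, phi_norm, z_norm = theorem_3_6_check(mu_A=1.0, L_A=3.0)
print(value, phi_norm, z_norm)
```

Up to rounding, the quadratic form vanishes at the non-zero vector z, while the norm of \(\varPhi \) equals \(L_\varPhi = \mu _A / L_A\).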

The next result shows that (3.3) is sharp.

Theorem 3.7

Let \(0< \mu _A < L_A\) be given. We define \(L_\varPhi := 2 \, \sqrt{\mu _A \, L_A} / (\mu _A + L_A) < 1\), i.e., (3.3) is violated. Then, there exists a linear symmetric operator A on \(V = \mathbb {R}^2\) (equipped with the Euclidean inner product) and a linear operator \(\varPhi \) in \(\mathbb {R}^2\), such that A is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz, \(\varPhi \) is \(L_\varPhi \)-Lipschitz and \(A \, (I-\varPhi )^{-1}\) is not coercive. Moreover, there exists a one-dimensional subspace \(K \subset \mathbb {R}^2\) such that (3.1) and (1.1) cannot be uniquely solvable for all \(f \in V^\star \).

Proof

We define

$$\begin{aligned} A := \begin{pmatrix} \mu _A &{}\quad 0 \\ 0 &{}\quad L_A \end{pmatrix} . \end{aligned}$$

It is clear that the operator A is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz. We further set

$$\begin{aligned} x := \begin{pmatrix} \sqrt{L_A / (\mu _A + L_A)} \\ \sqrt{\mu _A / (\mu _A + L_A)} \end{pmatrix} , \quad \varPhi := \frac{2}{(\mu _A + L_A)^2} \, \begin{pmatrix} \mu _A \, L_A &{} \mu _A \, \sqrt{\mu _A \, L_A} \\ L_A \, \sqrt{\mu _A \, L_A} &{} \mu _A \, L_A \end{pmatrix} . \end{aligned}$$

It can be checked that \(\varPhi \) is \(L_\varPhi \)-Lipschitz and \(\Vert x\Vert = 1\). However,

$$\begin{aligned} x^\top A \, (I - \varPhi ) \, x = 0 \qquad \text {and}\qquad y^\top A \, (I - \varPhi )^{-1} \, y = 0 \end{aligned}$$

where \(y = (I-\varPhi ) \, x \ne 0\). Hence, \(A \, (I-\varPhi )\) and \(A \, (I-\varPhi )^{-1}\) are not coercive. Moreover, if we set \(K = {\text {span}}\{y\}\) it is clear that (3.1) cannot be uniquely solvable for all \(f \in V^\star = \mathbb {R}^2\). Since \(I - \varPhi \) is a bijection, this implies that (1.1) is not uniquely solvable for all \(f \in V^\star = \mathbb {R}^2\). \(\square \)
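A numerical check of the symmetric counterexample is sketched below. It assumes that \(\varPhi \) is the rank-one operator \(\frac{2}{\mu _A + L_A} \, (A\,x)\,x^\top \), in analogy to the construction in Theorem 3.6; this representation and the sample values \(\mu _A = 1\), \(L_A = 4\) are assumptions of the sketch.

```python
import math

def theorem_3_7_check(mu_A, L_A):
    s = mu_A + L_A
    x = [math.sqrt(L_A / s), math.sqrt(mu_A / s)]
    Ax = [mu_A * x[0], L_A * x[1]]                 # A = diag(mu_A, L_A)
    alpha = 2.0 / s
    # Assumed rank-one operator Phi = alpha * (A x) x^T; its norm is alpha * |A x| * |x|.
    phi_norm = alpha * math.hypot(*Ax) * math.hypot(*x)
    x_sq = x[0] ** 2 + x[1] ** 2                   # x^T x, equal to 1 up to rounding
    Phi_x = [alpha * Ax[i] * x_sq for i in range(2)]
    y = [x[i] - Phi_x[i] for i in range(2)]        # y = (I - Phi) x
    # A is symmetric, so x^T A (I - Phi) x = (A x)^T y = y^T A (I - Phi)^{-1} y.
    value = Ax[0] * y[0] + Ax[1] * y[1]
    L_phi = 2.0 * math.sqrt(mu_A * L_A) / s
    return value, phi_norm, L_phi, math.hypot(*y)

value, phi_norm, L_phi, y_norm = theorem_3_7_check(mu_A=1.0, L_A=4.0)
print(value, phi_norm, L_phi, y_norm)
```

Up to rounding, both quadratic forms vanish, the norm of \(\varPhi \) coincides with \(L_\varPhi = 2\sqrt{\mu _A \, L_A} / (\mu _A + L_A)\), and y is non-zero.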

We further mention that it is also possible to obtain Assumption 3.1 in situations in which \(\varPhi \) is “not small”. For example, if \(\varPhi = \lambda \, I\) with some \(\lambda < 1\), Assumption 3.1 follows automatically whenever A is strongly monotone and Lipschitz, since \((I-\varPhi )^{-1} = (1-\lambda )^{-1} \, I\) in this case. Moreover, it is possible to analyze the situation in which A is a small perturbation of the derivative of a convex function by combining the ideas of Lemmas 3.3 and 3.5.

The combination of Theorem 3.2 and Lemma 3.3 yields a well-known result: under the assumption (3.2), the QVI (1.1) has a unique solution. Such a result is typically shown via contraction-type arguments, see, e.g., [14] or [2, Section 3.1.1]. Thus, the approach of this section is able to reproduce this classical result. However, the combination of Theorem 3.2 and Lemma 3.5 yields a new result in the case that A has a convex potential: the condition (3.2) on the Lipschitz constant \(L_\varPhi \) of \(\varPhi \) is relaxed to (3.3).

4 Localization of the smallness assumption

We localize the assumptions concerning the Lipschitz constant of \(\varPhi \).

Assumption 4.1

We assume that \(A :V \rightarrow V^\star \) is (globally) \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz. Further, let \({\bar{f}} \in V^\star \) be given and let \({\bar{y}}\) be a solution of (1.1). We suppose that there is a closed, convex neighborhood \(Y \subset V\) of \({\bar{y}}\) such that \(\varPhi \) is \(L_\varPhi \)-Lipschitz continuous on Y. Finally,

  1. (i)

    inequality (3.2) holds or

  2. (ii)

    inequality (3.3) holds and A is the Fréchet derivative of a convex function.

Theorem 4.2

Suppose that Assumption 4.1 is satisfied. There is a neighborhood \(F \subset V^\star \) of \({\bar{f}}\) such that (1.1) has exactly one solution in Y for all \(f \in F\). Moreover, this solution depends Lipschitz-continuously on f.

Note that we do not claim that (1.1) is uniquely solvable for all \(f \in F\) and (1.1) might have further solutions in \(V {\setminus } Y\).

Proof

We define \({\tilde{\varPhi }} :V \rightarrow V\) via

$$\begin{aligned} {\tilde{\varPhi }}(y) := \varPhi \big ({\text {Proj}}_Y(y)\big ). \end{aligned}$$

Since projections are 1-Lipschitz, \({\tilde{\varPhi }}\) is \(L_\varPhi \)-Lipschitz. Now, we consider the modified QVI

$$\begin{aligned} \text {Find } y \in {\tilde{Q}}(y) \quad \text {such that}\quad \langle A(y) - f ,v - y\rangle \ge 0 \qquad \forall v \in {\tilde{Q}}(y) \end{aligned}$$
(4.1)

with

$$\begin{aligned} {\tilde{Q}}(y) = K + {\tilde{\varPhi }}(y) . \end{aligned}$$

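From Assumption 4.1, Lemmas 3.3 and 3.5, and Theorem 3.2 it follows that (4.1) has a unique solution \(y = \tilde{S}(f)\) for every \(f \in V^\star \) and the solution operator \(\tilde{S} :V^\star \rightarrow V\) is Lipschitz continuous. Since \({\bar{y}} \in Y\) solves (1.1) for \(f = {\bar{f}}\) and since \({\tilde{\varPhi }} = \varPhi \) on Y, it also solves (4.1), i.e., \(\tilde{S}({\bar{f}}) = {\bar{y}}\). Hence, we can choose a neighborhood \(F \subset V^\star \) of \({\bar{f}}\), such that \(\tilde{S}(f) \in Y\) for all \(f \in F\).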

Since \(Q(y) = {\tilde{Q}}(y)\) for all \(y \in Y\), it is clear that \(y \in Y\) is a solution of (1.1) if and only if \(y \in Y\) solves (4.1). Hence, (1.1) has a unique solution in Y for all \(f \in F\). \(\square \)

5 Differential stability

In this section, we consider the situation of Assumption 4.1. However, we do not need Assumption 4.1 itself; the assertion of Theorem 4.2 suffices.

Assumption 5.1

We suppose that the following assumptions are satisfied.

  1. (i).

    We assume the existence of sets \(F \subset V^\star \), \(Y \subset V\) such that for every \(f \in F\), (1.1) has a unique solution y in Y and the solution map \(S :F \rightarrow Y\), \(f \mapsto y\) is Lipschitz continuous. For fixed \({\bar{f}} \in F\), we set \({\bar{y}} := S({\bar{f}})\). The sets F, Y are assumed to be neighborhoods of \({\bar{f}}\), \({\bar{y}}\), respectively.

  2. (ii).

    The operator \(\varPhi :V \rightarrow V\) is Lipschitz on Y, i.e., there exists \(L_\varPhi > 0\) with

    $$\begin{aligned} \Vert \varPhi ( y_1 ) - \varPhi ( y_2 ) \Vert _V \le L_\varPhi \, \Vert y_1 - y_2 \Vert _V \qquad \forall y_1, y_2 \in Y. \end{aligned}$$

    We suppose that \(I - \varPhi :Y \rightarrow Z\) is bijective with a Lipschitz continuous inverse, where \(Z := (I - \varPhi )(Y)\). Further, \(\varPhi \) is Fréchet differentiable at \({\bar{y}}\) and the bounded linear operator \(I - \varPhi '({\bar{y}})\) is bijective.

  3. (iii).

    The operator A is Fréchet differentiable at \({\bar{y}}\) and the bounded linear operator

    $$\begin{aligned} A'({\bar{y}}) \, (I - \varPhi '(\bar{y}))^{-1} \end{aligned}$$
    (5.1)

    is assumed to be coercive.

  4. (iv).

    The set K is polyhedric at \(({\bar{z}}, {\bar{f}} - A({\bar{y}}))\), where \({\bar{z}} = (I - \varPhi )({\bar{y}})\).

Due to (1.2), the last assumption is equivalent to the polyhedricity of \(Q({\bar{y}})\) at \(({\bar{y}}, {\bar{f}} - A({\bar{y}}))\).

First, we show that Assumption 5.1 follows from Assumption 4.1 and from the differentiability of \(\varPhi \) and A.

Theorem 5.2

Suppose that Assumption 4.1 is satisfied. Then, Assumption 5.1 (i) holds. If \(\varPhi \) is Fréchet differentiable at \({\bar{y}}\), then Assumption 5.1 (ii) holds. If, additionally, A is Fréchet differentiable at \(\bar{y}\), then Assumption 5.1 (iii) is satisfied.

Proof

Assumption 5.1 (i) follows from Theorem 4.2.

Since \(L_\varPhi < 1\), Banach’s fixed point theorem implies that \(I - \varPhi \) is bijective with a Lipschitz continuous inverse. The invertibility of \(I - \varPhi '({\bar{y}})\) follows from the Neumann series since \(\Vert \varPhi '({\bar{y}})\Vert \le L_\varPhi < 1\).

If A is Fréchet differentiable at \({\bar{y}}\), Assumption 4.1 implies that \(A'({\bar{y}})\) is \(\mu _A\)-strongly monotone and \(L_A\)-Lipschitz. In case that (3.2) is satisfied, we can invoke Lemma 3.3 to obtain Assumption 5.1 (iii). Otherwise, A is the Fréchet derivative of a convex function. Hence, \(A'({\bar{y}})\) is symmetric since it is a second Fréchet derivative, see [7, Theorem 5.1.1]. Thus, \(A'({\bar{y}})\) is the derivative of the convex function \(v \mapsto \langle A'(\bar{y})\,v,v\rangle /2\). Therefore, we can invoke Lemma 3.5 to obtain Assumption 5.1 (iii). \(\square \)

Lemma 5.3

Let us assume that Assumption 5.1 (i)–(ii) is satisfied. Then, \((I - \varPhi )^{-1}\) is Fréchet differentiable at \({\bar{z}} := (I - \varPhi )({\bar{y}})\) and \(( (I-\varPhi )^{-1} )'({\bar{z}}) = (I-\varPhi '({\bar{y}}))^{-1}\).

Proof

For arbitrary \(h \in V\) we have

$$\begin{aligned} h = (I - \varPhi ) \big [ (I-\varPhi )^{-1} ({\bar{z}} + h) - {\bar{y}} + \bar{y}\big ] - {\bar{z}} . \end{aligned}$$

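Using the Fréchet differentiability of \(\varPhi \) at \({\bar{y}}\) implies

$$\begin{aligned} h = (I - \varPhi '({\bar{y}})) \, \big [ (I-\varPhi )^{-1}({\bar{z}} + h) - {\bar{y}} \big ] + o\big ( \Vert (I-\varPhi )^{-1}({\bar{z}} + h) - {\bar{y}}\Vert _V \big ) \end{aligned}$$

as \(\Vert (I-\varPhi )^{-1}({\bar{z}} + h) - {\bar{y}}\Vert _V \rightarrow 0\). Finally, using the fact that \((I-\varPhi )^{-1}\) is Lipschitz implies

$$\begin{aligned} (I-\varPhi )^{-1}({\bar{z}} + h) - {\bar{y}} - (I - \varPhi '({\bar{y}}))^{-1} \, h = o( \Vert h\Vert _V ) \qquad \text {as } \Vert h\Vert _V \rightarrow 0 . \end{aligned}$$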

This shows the claim. \(\square \)
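The formula for the derivative of \((I-\varPhi )^{-1}\) can be tested with finite differences in one dimension. The sketch below uses the hypothetical choice \(\varPhi (y) = \sin (y)/2\) and the base point \({\bar{y}} = 0.7\); the inverse is evaluated by a fixed-point iteration.

```python
import math

def phi(y):
    return 0.5 * math.sin(y)          # hypothetical Phi with L_phi = 0.5 < 1

def dphi(y):
    return 0.5 * math.cos(y)

def inverse(z, tol=1e-15, max_iter=1000):
    """(I - phi)^{-1}(z) via the contraction y <- z + phi(y)."""
    y = z
    for _ in range(max_iter):
        y_new = z + phi(y)
        if abs(y_new - y) <= tol:
            return y_new
        y = y_new
    return y

y_bar = 0.7
z_bar = y_bar - phi(y_bar)
predicted = 1.0 / (1.0 - dphi(y_bar))     # (I - Phi'(y_bar))^{-1} in one dimension
for h in (1e-2, 1e-3, 1e-4):
    quotient = (inverse(z_bar + h) - inverse(z_bar)) / h
    print(h, quotient, predicted)
```

The difference quotients approach the predicted value as h decreases.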

Lemma 5.4

Let us assume that Assumption 5.1 (i)–(iii) is satisfied. The operator \(B := A \circ (I - \varPhi )^{-1}\) is Fréchet differentiable at \({\bar{z}}\) and its Fréchet derivative is given by \(B'({\bar{z}})= A'({\bar{y}}) \, (I - \varPhi '(\bar{y}))^{-1}\).

Proof

This follows from Lemma 5.3 together with the chain rule. \(\square \)

Theorem 5.5

Let us assume that Assumption 5.1 is satisfied. Then, the solution map S is directionally differentiable at \({\bar{f}}\) and the directional derivative \(x := S'({\bar{f}}; h)\) in direction \(h \in V^\star \) is given by the unique solution of the QVI

$$\begin{aligned} \text {Find } x \in Q^{{\bar{y}}}(x) \quad \text {such that}\quad \langle A'({\bar{y}}) \, x - h ,v - x\rangle \ge 0 \qquad \forall v \in Q^{\bar{y}}(x) , \end{aligned}$$
(5.2)

where the set-valued mapping \(Q^{{\bar{y}}} :V \rightrightarrows V\) is given by

$$\begin{aligned} Q^{{\bar{y}}}(x) := \mathcal {K}_K({\bar{z}}, {\bar{f}} - A({\bar{y}})) + \varPhi '({\bar{y}}) \, x . \end{aligned}$$

Note that we have

$$\begin{aligned} \mathcal {K}_K({\bar{z}}, {\bar{f}} - A({\bar{y}})) = \mathcal {K}_{Q({\bar{y}})}({\bar{y}}, {\bar{f}} - A({\bar{y}})) \end{aligned}$$

due to (1.2).

Proof

Let \(h \in V^\star \) be given. There exists \(T > 0\) such that \({\bar{f}} + t \, h \in F\) for all \(t \in [0,T)\). For \(t \in (0,T)\) we define

$$\begin{aligned} y_t&:= S\big ({\bar{f}} + t \, h\big ),&x_t&:= \frac{y_t - {\bar{y}}}{t} \\ z_t&:= (I - \varPhi )(y_t)&w_t&:= \frac{z_t - {\bar{z}}}{t} . \end{aligned}$$

Since S is assumed to be Lipschitz continuous on F, the difference quotients \(\{ x_t \big |t \in (0,T) \}\) are bounded in V. The Lipschitz continuity of \(\varPhi \) implies the boundedness of \(\{ w_t \big |t \in (0,T) \}\) in V.

Since \(z_t\) solves the VI (3.1), i.e.,

$$\begin{aligned} \text {Find } z \in K \quad \text {such that}\quad \big \langle B(z) - f ,v - z\big \rangle \ge 0 \qquad \forall v \in K \end{aligned}$$

with \(f := {\bar{f}} + t \, h\), we can apply [9, Theorem 2.13] to obtain the convergence of the difference quotients \(w_t\). Let us check that the assumptions of [9, Theorem 2.13] are satisfied. The standing assumption [9, Assumption 2.1] is satisfied in our Hilbert space setting with \(j = \delta _K\) being the indicator function (in the sense of convex analysis) of the set K. The validity of [9, Assumption 2.2] follows from the Taylor expansion

$$\begin{aligned} B(z_t) = B({\bar{z}} + t \, w_t) = B({\bar{z}}) + t \, B'({\bar{z}}) \, w_t + r(t) \end{aligned}$$

with \(\Vert r(t)\Vert _{V^\star } = o(t)\) as \(t \searrow 0\), see Lemma 5.4. It remains to check that the assumption [9, Theorem 2.13 (ii)] holds:

  • Since K is assumed to be polyhedric at \({\bar{z}}\) w.r.t. \({\bar{f}} - A({\bar{y}})\), its indicator function \(\delta _K\) is twice epi-differentiable at \(({\bar{z}}, {\bar{f}} - A({\bar{y}}))\), see [9, Corollary 3.3]. Moreover, its second subderivative is the indicator function of the critical cone \(\mathcal {K}_K({\bar{z}}, {\bar{f}} - A({\bar{y}})) := \mathcal {T}_K({\bar{z}}) \cap ({\bar{f}} - A({\bar{y}}))^\perp \).

  • The weak convergence \(w_n \rightharpoonup w\) in V implies \(B'({\bar{z}}) \, w_n \rightharpoonup B'({\bar{z}}) \, w\) in \(V^\star \) and \(\liminf _{n \rightarrow \infty } \langle B'({\bar{z}}) w_n,w_n\rangle \ge \langle B'({\bar{z}})\,w,w\rangle \) follows from the coercivity of the linear operator \(B'({\bar{z}}) = A'({\bar{y}}) \, (I - \varPhi '({\bar{y}}))^{-1}\), see Lemma 2.1, Assumption 5.1 (iii) and Lemma 5.4.

Thus, the application of [9, Theorem 2.13] yields that all accumulation points w of \(w_t\) for \(t \searrow 0\) are solutions of the linearized VI

$$\begin{aligned} \begin{aligned}&\text {Find } w \in \mathcal {K}_K({\bar{z}}, {\bar{f}} - A({\bar{y}})) \\&\text {such that}\quad \big \langle B'({\bar{z}}) \, w - h ,v - w\big \rangle \ge 0 \qquad \forall v \in \mathcal {K}_K({\bar{z}}, {\bar{f}} - A({\bar{y}})) . \end{aligned} \end{aligned}$$
(5.3)

Since \(B'({\bar{z}})\) is coercive, this linearized VI has a unique solution. Hence, the last part of [9, Theorem 2.13] implies \(w_t \rightarrow w\) as \(t \searrow 0\).

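It remains to prove the convergence of \(x_t\) towards the solution of (5.2). Using the differentiability of \((I - \varPhi )^{-1}\), we find

$$\begin{aligned} x_t = \frac{y_t - {\bar{y}}}{t} = \frac{(I-\varPhi )^{-1}({\bar{z}} + t \, w_t) - (I-\varPhi )^{-1}({\bar{z}})}{t} \rightarrow (I - \varPhi '({\bar{y}}))^{-1} \, w =: x \qquad \text {as } t \searrow 0 , \end{aligned}$$

see Lemma 5.3.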

The change of variables \(w = (I - \varPhi '({\bar{y}})) \, x\) shows the equivalence of (5.2) and (5.3). Thus, x is the unique solution of (5.2). \(\square \)
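The nonlinearity of the directional derivative in the direction h is already visible in one dimension. The sketch below takes \(V = \mathbb {R}\), \(K = (-\infty ,\psi ]\), a linear \(A(y) = a\,y\) and \(\varPhi (y) = \sin (y)/4\), all of which are hypothetical choices. The data are arranged such that the multiplier \({\bar{f}} - A({\bar{y}})\) vanishes and \({\bar{z}} = \psi \), so the critical cone is \((-\infty ,0]\). The solution map is evaluated via the VI (3.1), using that B is increasing in one dimension.

```python
import math

a = 2.0                                   # A(y) = a * y (hypothetical linear choice)
phi = lambda y: 0.25 * math.sin(y)        # hypothetical Phi, L_phi = 0.25 < 1/gamma_A = 1
dphi = lambda y: 0.25 * math.cos(y)

def inverse(z, tol=1e-15, max_iter=1000):
    """(I - phi)^{-1}(z) via the contraction y <- z + phi(y)."""
    y = z
    for _ in range(max_iter):
        y_new = z + phi(y)
        if abs(y_new - y) <= tol:
            return y_new
        y = y_new
    return y

y_bar = 0.7
psi = y_bar - phi(y_bar)                  # obstacle chosen such that z_bar = psi
f_bar = a * y_bar                         # f_bar - A(y_bar) = 0 (biactive case)

def S(f):
    """Solution map: z = min(psi, unconstrained z), then y = (I - phi)^{-1}(z)."""
    y_free = f / a
    z = min(psi, y_free - phi(y_free))
    return inverse(z)

def predicted_derivative(h):
    """Solve (5.3) on the critical cone (-inf, 0] and transform back to x."""
    B_prime = a / (1.0 - dphi(y_bar))
    w = min(0.0, h / B_prime)
    return w / (1.0 - dphi(y_bar))        # x = (I - Phi'(y_bar))^{-1} w

t = 1e-6
for h in (1.0, -1.0, 2.5):
    quotient = (S(f_bar + t * h) - S(f_bar)) / t
    print(h, quotient, predicted_derivative(h))
```

For positive h the derivative vanishes, whereas for negative h it equals \(h / a\); hence \(h \mapsto S'({\bar{f}};h)\) is positively homogeneous but not linear in this example.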

Some remarks concerning Theorem 5.5 are in order.

Remark 5.6

  1. (i)

    The polyhedricity assumption Assumption 5.1 (iv) can be replaced by the strong twice epi-differentiability of the indicator function \(\delta _K\) in the sense of [9, Definition 2.9]. Under this generalized assumption, the second epi-derivative of \(\delta _K\) appears as a curvature term in the linearized inequalities (5.2) and (5.3). Note that the indicator function of the critical cone \(\mathcal {K}_K({\bar{z}}, {\bar{f}} - A({\bar{y}}))\), which appears implicitly in (5.2) and (5.3), is just the second epi-derivative of \(\delta _K\) in the case of K being polyhedric.

  2. (ii)

    We have derived the differentiability result under the assumption that \(\varPhi \) is Fréchet differentiable at \({\bar{y}}\). In the notation of [9], this translates to linearity of the operator \(A_x\). However, an inspection of the proof of [9, Theorem 2.13] shows that it is possible to replace the Fréchet differentiability of \(\varPhi \) by the following set of assumptions:

    1. (a)

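      \(\varPhi \) is Bouligand differentiable at \({\bar{y}}\), i.e., there exists \(\varPhi '({\bar{y}}; \cdot ) :V \rightarrow V\) such that

      $$\begin{aligned} \frac{\Vert \varPhi ({\bar{y}} + h) - \varPhi ({\bar{y}}) - \varPhi '({\bar{y}}; h)\Vert _V}{\Vert h\Vert _V} \rightarrow 0 \qquad \text {as } \Vert h\Vert _V \rightarrow 0. \end{aligned}$$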

    2. (b)

      For every sequence \(w_n \rightharpoonup w\) in V, we assume

      $$\begin{aligned} (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w_n)&\rightharpoonup (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w) , \end{aligned}$$
      (5.4a)
      $$\begin{aligned} \liminf _{n \rightarrow \infty } \langle A'({\bar{y}}) \, (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w_n) , w_n \rangle&\ge \langle A'({\bar{y}}) \, (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w) , w \rangle . \end{aligned}$$
      (5.4b)

    Note that (a) implies that \(\varPhi '({\bar{y}}; \cdot )\) is Lipschitz on V with constant \(L_\varPhi \). Hence, \((I - \varPhi '({\bar{y}};\cdot ))\) is invertible and Lemmas 3.3 and 3.5 can be used to obtain the strong monotonicity of \(A'({\bar{y}}) \, (I - \varPhi '({\bar{y}}; \cdot ))^{-1}\).

    Property (5.4a) can be verified by assuming, e.g., weak continuity of \(\varPhi '({\bar{y}}; \cdot )\). Indeed, the sequence \(z_n := (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w_n)\) is bounded, hence, \(z_n \rightharpoonup z\) along a subsequence. Now, weak continuity implies \(w_n = z_n - \varPhi '({\bar{y}}; z_n) \rightharpoonup z - \varPhi '({\bar{y}}; z)\) and \(w_n \rightharpoonup w\) implies \(z = (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w)\), i.e. \(z_n \rightharpoonup (I - \varPhi '({\bar{y}}; \cdot ))^{-1}(w)\) along a subsequence. The uniqueness of the limit point implies the convergence of the entire sequence.

    Finally, (5.4b) can be obtained via (5.4a) and Lemma 2.1.

In the next remark, we compare our differentiability result with [1, Theorem 1].

Remark 5.7

In [1, Theorem 1] a similar differentiability result is obtained in a more restrictive setting:

  1. (i)

    Therein, the leading operator A has to be linear and T-monotone (w.r.t. a vector space order on V). Our approach also allows for non-linear operators and we do not need any order structure on V. Similarly, we do not need any monotonicity assumptions on \(\varPhi \).

  2. (ii)

    They require the complete continuity of \(\varPhi '({\bar{y}})\), which is not needed in Theorem 5.5.

  3. (iii)

    One of their most restrictive assumptions is the assumption (A5). Via [7, Theorem 3.1.2], this assumption is equivalent to \(\varPhi \) being \(L_\varPhi \)-Lipschitz with

    $$\begin{aligned} L_\varPhi < \frac{1}{1 + \gamma _A}. \end{aligned}$$
    (5.5)

    This inequality is much stronger than (3.2). Thus, their assumption (A5) implies that the solutions to the QVI (1.1) are unique.

Moreover, they obtained the differentiability only for non-negative directions whereas our approach is applicable to arbitrary perturbations of the right-hand side.

One assumption in [1] is weaker: they only need Hadamard differentiability of \(\varPhi \). We need Fréchet differentiability (or Bouligand differentiability, see Remark 5.6).

6 Optimal control

In this section, we consider the optimal control problem

$$\begin{aligned} \begin{aligned} \text {Minimize} \quad&J(y,u) \\ \text {w.r.t.}\quad&y \in V, u \in U \\ \text {s.t.}\quad&y \in Q(y) \quad \text {and}\quad \langle A(y) - (B \, u + f) ,v - y\rangle \ge 0 \quad \forall v \in Q(y) . \end{aligned} \end{aligned}$$
(6.1)

Here, \(f \in V^\star \) is fixed, U is a Hilbert space and the bounded, linear operator \(B :U \rightarrow V^\star \) is assumed to have a dense range. Moreover, the objective \(J :V \times U \rightarrow \mathbb {R}\) is Fréchet differentiable.

The main goal of this section is the derivation of stationarity conditions for local solutions of (6.1). Since the constraints of (6.1) contain a QVI, this is a delicate issue. Using the (local) solution map S of the QVI, one can consider the reduced problem

$$\begin{aligned} \text {Minimize}\quad J(S(B\,u+f),u). \end{aligned}$$

Under the assumptions of Theorem 5.5, this reduced objective function is directionally differentiable and we obtain the stationarity condition

$$\begin{aligned} \langle J_y({\bar{y}}, {\bar{u}}),S'(B\,{\bar{u}} + f; B \, h) \rangle _{V^\star ,V} + \langle J_u({\bar{y}}, {\bar{u}}),h\rangle _{U^\star ,U} \ge 0 \qquad \forall h \in U. \end{aligned}$$

In the literature, such an inequality is called B-stationarity. In some situations, it is possible to introduce dual variables to obtain a so-called system of strong stationarity which is equivalent to B-stationarity. The next theorem shows that this is indeed possible for (6.1).

Theorem 6.1

Suppose that \(({\bar{y}}, {\bar{u}})\) is locally optimal for (6.1). In addition to the assumptions on B and J, we assume that Assumption 5.1 is satisfied at \({\bar{f}} := B \, {\bar{u}} + f\). Then, there exist unique multipliers \(p \in V\), \(\mu \in V^\star \) such that the system

$$\begin{aligned}&\displaystyle J_y(\cdot ) + A'({\bar{y}})^\star \, p + (I - \varPhi '({\bar{y}}))^\star \mu = 0,&\end{aligned}$$
(6.2a)
$$\begin{aligned}&\displaystyle J_u(\cdot ) -B^\star \, p = 0,&\end{aligned}$$
(6.2b)
$$\begin{aligned}&\displaystyle p \in -\mathcal {K}_K({\bar{z}}, {\bar{\lambda }}),&\end{aligned}$$
(6.2c)
$$\begin{aligned}&\displaystyle \mu \in \mathcal {K}_K({\bar{z}}, {\bar{\lambda }})^\circ&\end{aligned}$$
(6.2d)

is satisfied. Here,

$$\begin{aligned} {\bar{z}} = (I - \varPhi )({\bar{y}}) \in K \qquad \text {and}\qquad {\bar{\lambda }} = B \, {\bar{u}} + f - A({\bar{y}}) \in \mathcal {T}_K({\bar{z}})^\circ , \end{aligned}$$
(6.3)

and \(J_y(\cdot ) \in V^\star \) and \(J_u(\cdot ) \in U^\star \) are the partial Fréchet derivatives of J at \(({\bar{y}}, {\bar{u}})\).
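
As a sanity check, consider the unconstrained case \(K = V\). The following sketch assumes the usual definition \(\mathcal {K}_K({\bar{z}}, {\bar{\lambda }}) = \mathcal {T}_K({\bar{z}}) \cap {\bar{\lambda }}^\perp \) of the critical cone. Then, \(\mathcal {T}_K({\bar{z}}) = V\), so that (6.3) forces \({\bar{\lambda }} = 0\), and we obtain

$$\begin{aligned} \mathcal {K}_K({\bar{z}}, {\bar{\lambda }}) = V, \qquad \mathcal {K}_K({\bar{z}}, {\bar{\lambda }})^\circ = \{0\}, \qquad \mu = 0 . \end{aligned}$$

Hence, (6.2c) is void and (6.2a), (6.2b) reduce to

$$\begin{aligned} J_y(\cdot ) + A'({\bar{y}})^\star \, p = 0, \qquad J_u(\cdot ) - B^\star \, p = 0, \end{aligned}$$

which is the familiar adjoint system for the state equation \(A({\bar{y}}) = B \, {\bar{u}} + f\).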

Proof

We use classical arguments dating back to [12, Proposition 4.1], see also [17, Theorem 5.3].

Due to Assumption 5.1 we can invoke Theorem 5.5 to obtain the directional differentiability of the control-to-state map. Combined with the local optimality of \(({\bar{y}}, {\bar{u}})\), this implies

$$\begin{aligned} \langle J_y(\cdot ),S'(B\,{\bar{u}} + f; B \, h) \rangle _{V^\star ,V} + \langle J_u(\cdot ),h\rangle _{U^\star ,U} \ge 0 \qquad \forall h \in U. \end{aligned}$$
(6.4)

Due to the Lipschitz estimate \(\Vert S'(B\,{\bar{u}} + f; B \, h)\Vert _{V} \le C \, \Vert B \, h\Vert _{V^\star }\), the above inequality implies

$$\begin{aligned} |\langle J_u(\cdot ),h\rangle _{U^\star ,U}| \le C \, \Vert B \, h\Vert _{V^\star } \qquad \forall h \in U. \end{aligned}$$
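
This estimate can be verified by testing (6.4) with both h and \(-h\). A short sketch, in which C on the right-hand side is the constant from the Lipschitz estimate, and the product \(C \, \Vert J_y(\cdot )\Vert _{V^\star }\) is the constant that is again denoted by C in the estimate above:

$$\begin{aligned} \pm \langle J_u(\cdot ),h\rangle _{U^\star ,U}&\ge - \langle J_y(\cdot ),S'(B\,{\bar{u}} + f; \pm B \, h) \rangle _{V^\star ,V} \\&\ge - \Vert J_y(\cdot )\Vert _{V^\star } \, \Vert S'(B\,{\bar{u}} + f; \pm B \, h)\Vert _{V} \\&\ge - C \, \Vert J_y(\cdot )\Vert _{V^\star } \, \Vert B \, h\Vert _{V^\star } . \end{aligned}$$

Since this holds for both signs, the absolute value \(|\langle J_u(\cdot ),h\rangle _{U^\star ,U}|\) is bounded by a multiple of \(\Vert B \, h\Vert _{V^\star }\).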

Hence, there is \(p \in V^{\star \star }\cong V\) (by defining it as in the next line on the dense subspace \({\text {image}}(B) \subset V^\star \) and extending it by continuity on the whole space \(V^\star \)) such that

$$\begin{aligned} \langle J_u(\cdot ),h\rangle _{U^\star ,U} = \langle p,B \, h\rangle _{V,V^\star } \qquad \forall h \in U. \end{aligned}$$

This yields (6.2b) and

$$\begin{aligned} \langle J_y(\cdot ),S'(B\,{\bar{u}} + f; B \, h)\rangle _{V^\star ,V} + \langle p,B \, h\rangle _{V,V^\star } \ge 0 \qquad \forall h \in U . \end{aligned}$$

Using the density of \({\text {image}}(B)\) in \(V^\star \) we get

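$$\begin{aligned} \langle J_y(\cdot ),S'(B\,{\bar{u}} + f; h)\rangle _{V^\star ,V} + \langle p,h\rangle _{V,V^\star } \ge 0 \qquad \forall h \in V^\star . \end{aligned}$$
(*)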

In what follows, we set \(\mathcal {K}:= \mathcal {K}_K({\bar{z}}, {\bar{\lambda }})\) for convenience. We recall that \(S'(B \, {\bar{u}} + f; h)\) is the unique solution of

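$$\begin{aligned} \text {Find } x \in Q^{{\bar{y}}}(x) \quad \text {such that}\quad \langle A'({\bar{y}}) \, x - h ,v - x\rangle \ge 0 \qquad \forall v \in Q^{{\bar{y}}}(x) , \end{aligned}$$
(**)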

where the set-valued mapping \(Q^{{\bar{y}}} :V \rightrightarrows V\) is given by

$$\begin{aligned} Q^{{\bar{y}}}(x) = \mathcal {K}+ \varPhi '({\bar{y}}) \, x . \end{aligned}$$

We choose \(h \in \mathcal {K}^\circ \) in (*). We check that (**) implies \(S'(B \, {\bar{u}} + f; h) = 0\). Indeed, \(0 \in Q^{{\bar{y}}}(0) = \mathcal {K}\) and

$$\begin{aligned} \langle A'({\bar{y}}) \, 0 - h ,v - 0\rangle \ge 0 \qquad \forall v \in Q^{{\bar{y}}}(0) = \mathcal {K}\end{aligned}$$

holds since \(h \in \mathcal {K}^\circ \). Thus, (*) implies

$$\begin{aligned} \langle p,h\rangle _{V,V^\star } \ge 0 \qquad \forall h \in \mathcal {K}^\circ , \end{aligned}$$

i.e., \(p \in -\mathcal {K}\), which shows (6.2c).
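
The identification in the last step rests on the bipolar theorem. A short sketch, assuming that the critical cone \(\mathcal {K}\) is a closed convex cone (so that \(\mathcal {K}^{\circ \circ } = \mathcal {K}\)) and that V is identified with its bidual:

$$\begin{aligned} \langle -p,h\rangle _{V,V^\star } \le 0 \quad \forall h \in \mathcal {K}^\circ \qquad \Longleftrightarrow \qquad -p \in \mathcal {K}^{\circ \circ } = \mathcal {K}\qquad \Longleftrightarrow \qquad p \in -\mathcal {K}. \end{aligned}$$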

Now, we choose \(w \in (I - \varPhi '({\bar{y}}))^{-1} \mathcal {K}\) and set \(h = A'({\bar{y}}) \, w\). It can be checked that (**) implies \(S'(B \, {\bar{u}} + f ; h) = w\). Indeed, \(w \in Q^{{\bar{y}}}(w) = \mathcal {K}+ \varPhi '({\bar{y}})\,w\) and

$$\begin{aligned} \langle A'({\bar{y}}) \, w - h ,v - w\rangle = 0 \qquad \forall v \in Q^{\bar{y}}(w) \end{aligned}$$

due to the definition of h. With this choice, (*) implies

$$\begin{aligned} \langle J_y(\cdot ) + A'({\bar{y}})^\star \, p,w\rangle _{V^\star ,V} \ge 0 \qquad \forall w \in (I - \varPhi '({\bar{y}}))^{-1} \mathcal {K}. \end{aligned}$$

We define \(\mu := -(I - \varPhi '({\bar{y}}))^{-\star }(J_y(\cdot ) + A'({\bar{y}})^\star \, p)\) and get (6.2a) and

$$\begin{aligned} \langle (I - \varPhi '({\bar{y}}))^\star \mu ,w\rangle _{V^\star ,V} \le 0 \qquad \forall w \in (I - \varPhi '({\bar{y}}))^{-1} \mathcal {K}. \end{aligned}$$

Since \(I - \varPhi '({\bar{y}})\) is a bijection, this is equivalent to (6.2d).
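
This equivalence can be seen by substituting \(k = (I - \varPhi '({\bar{y}})) \, w\). A brief sketch:

$$\begin{aligned} \langle (I - \varPhi '({\bar{y}}))^\star \mu ,w\rangle _{V^\star ,V} = \langle \mu ,(I - \varPhi '({\bar{y}})) \, w\rangle _{V^\star ,V} = \langle \mu ,k\rangle _{V^\star ,V} , \end{aligned}$$

and k ranges over all of \(\mathcal {K}\) as w ranges over \((I - \varPhi '({\bar{y}}))^{-1} \mathcal {K}\). Hence, the inequality above reads \(\langle \mu ,k\rangle _{V^\star ,V} \le 0\) for all \(k \in \mathcal {K}\), i.e., \(\mu \in \mathcal {K}^\circ \).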

The uniqueness of p and \(\mu \) follows from the injectivity of \(B^\star \) and the bijectivity of \((I - \varPhi '({\bar{y}}))^\star \). \(\square \)

The approach of [8, Section 6.1] can be used to provide strong stationarity systems under less restrictive assumptions on K, i.e., the polyhedricity assumption can be replaced by the twice epi-differentiability of the indicator function \(\delta _K\).

In the case that \(\varPhi \) is merely Bouligand differentiable, cf. Remark 5.6, conditions (6.2a) and (6.2d) could be rewritten as

$$\begin{aligned} J_y(\cdot ) + A'({\bar{y}})^\star \, p + {\hat{\mu }} = 0, \qquad {\hat{\mu }} \in \big [(I - \varPhi '({\bar{y}};\cdot ))^{-1} \mathcal {K}_K({\bar{z}}, \bar{\lambda })\big ]^\circ , \end{aligned}$$
(6.5)

see the proof of Theorem 6.1.

Finally, we show that the system of strong stationarity is of reasonable strength, i.e., it implies the B-stationarity (6.4).

Lemma 6.2

Let \(({\bar{y}}, {\bar{u}})\) be a feasible point of (6.1) such that Assumption 5.1 is satisfied at \({\bar{f}} := B \, {\bar{u}} + f\). Moreover, suppose that J is Fréchet differentiable. If there exist multipliers \(p \in V\), \(\mu \in V^\star \) satisfying (6.2), then (6.4) holds.

Proof

For an arbitrary \(h \in U\) we define \(x := S'(B \, {\bar{u}} + f; B \, h)\). Then

$$\begin{aligned}&\langle J_y(\cdot ),x\rangle _{V^\star ,V} + \langle J_u(\cdot ),h\rangle _{U^\star ,U}\\&\qquad = \langle -A'({\bar{y}})^\star \, p - (I - \varPhi '(\bar{y}))^\star \mu ,x\rangle _{V^\star ,V} + \langle B^\star p,h\rangle _{U^\star ,U} \\&\qquad = -\,\langle p,A'({\bar{y}}) \, x - B \, h\rangle _{V,V^\star } -\langle \mu ,(I - \varPhi '({\bar{y}})) \, x\rangle _{V^\star ,V} . \end{aligned}$$

From the linearized QVI (5.2) and the strong stationarity system (6.2), we have

$$\begin{aligned} (I - \varPhi '({\bar{y}})) \, x&\in \mathcal {K}_K({\bar{z}}, {\bar{\lambda }}),&A'(\bar{y}) \, x - B \, h&\in - \mathcal {K}_K({\bar{z}}, {\bar{\lambda }})^\circ , \\ p&\in -\mathcal {K}_K({\bar{z}}, {\bar{\lambda }}),&\mu&\in \mathcal {K}_K({\bar{z}}, \bar{\lambda })^\circ , \end{aligned}$$

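where we used (6.3). Consequently, both duality pairings in the last line of the previous identity are non-positive, so that its right-hand side is non-negative. Thus, (6.4) follows. \(\square \)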