1 Introduction

We study bi-level stochastic linear programs with random right-hand side in the lower-level constraint system. The sequential nature of bi-level programming motivates a setting where the leader decides nonanticipatorily, while the follower can observe the realization of the randomness. A discussion of the related literature is provided in the recent [1]. A central result of [1] states that evaluating the leader’s random outcome by taking the expectation leads to a continuously differentiable functional if the underlying probability measure is absolutely continuous w.r.t. the Lebesgue measure. This allows us to formulate first-order necessary optimality conditions for the risk-neutral model. The main result of the present work provides sufficient conditions, namely boundedness of the support and uniform boundedness of the Lebesgue density of the underlying probability measure, that ensure Lipschitz continuity of the gradient of the expectation functional. Moreover, we show that the assumptions of [1] are too weak to even guarantee local Lipschitz continuity of the gradient. By the main result, second-order necessary and sufficient optimality conditions can be formulated in terms of generalized Hessians. As part of the preparatory work for the proof of the main result, we show in particular that any region of strong stability in the sense of [1, Definition 4.1] is a finite union of polyhedral cones. This representation is of independent interest, as it may facilitate the calculation or estimation of gradients of the expectation functional and thus enhance gradient descent-based approaches.

The paper is organized as follows: The model and related results of [1] are discussed in Sect. 2, while the main result and a variation with weaker assumptions are formulated in Sect. 3. Sections 4 and 5 are dedicated to geometric properties of regions of strong stability and related projections that appear in the representation of the gradient. Results of these sections play an important role in the proof of the main result, which is given in Sect. 6. A second-order sufficient optimality condition is formulated in Sect. 7. The paper concludes with a brief discussion of the results and an outlook in Sect. 8.

2 Model and Notation

Consider the optimistic formulation of a parametric bi-level linear program

$$\begin{aligned} \min _x \left\{ c^\top x + \min _y \{q^\top y \; :\; y \in \varPsi (x,z)\} \; :\; x \in X \right\} , \end{aligned}$$
(1)

where \(z \in {\mathbb {R}}^s\) is a parameter and the data comprise a nonempty polyhedron \(X \subseteq {\mathbb {R}}^n\), vectors \(c \in {\mathbb {R}}^n\), \(q \in {\mathbb {R}}^m\) and the lower-level optimal solution set mapping \(\varPsi : {\mathbb {R}}^n \times {\mathbb {R}}^s \rightrightarrows {\mathbb {R}}^m\) defined by

$$\begin{aligned} \varPsi (x,z) := \underset{y}{\mathrm {Argmin}} \; \{d^\top y \; :\; Ay \le Tx + z\} \end{aligned}$$

with \(A \in {\mathbb {R}}^{s \times m}\), \(T \in {\mathbb {R}}^{s \times n}\) and \(d \in {\mathbb {R}}^m\). By [1, Lemma 2.1], the extended real-valued mapping \(f: {\mathbb {R}}^n \times {\mathbb {R}}^s \rightarrow \overline{{\mathbb {R}}} := {\mathbb {R}} \cup \lbrace \pm \infty \rbrace \) given by

$$\begin{aligned} f(x,z) := c^\top x + \min _y \lbrace q^\top y \; :\; y \in \varPsi (x,z) \rbrace \end{aligned}$$

is real valued and Lipschitz continuous on the polyhedron

$$\begin{aligned} F = \{(x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; \exists y \in {\mathbb {R}}^m: Ay \le Tx + z\} \end{aligned}$$

if \(\mathrm {dom} \; f\) is nonempty. Let \(Z: \Omega \rightarrow {\mathbb {R}}^s\) be a random vector on some probability space \((\Omega , {\mathcal {F}}, {\mathbb {P}})\) and denote the induced Borel probability measure by \(\mu _Z = {\mathbb {P}} \circ Z^{-1} \in {\mathcal {P}}({\mathbb {R}}^s)\). Furthermore, we introduce the set

$$\begin{aligned} F_Z := \lbrace x \in {\mathbb {R}}^n \; :\; (x,z) \in F \; \forall z \in \mathrm {supp} \; \mu _Z \rbrace . \end{aligned}$$

If \(\mathrm {dom} \; f\) is nonempty and we impose the moment condition

$$\begin{aligned} \mu _Z \in {\mathcal {M}}^1_s := \left\{ \mu \in {\mathcal {P}}({\mathbb {R}}^s) \; :\; \int _{{\mathbb {R}}^s} \Vert z\Vert ~\mu (\mathrm{d}z) < \infty \right\} , \end{aligned}$$

the mapping \({\mathbb {F}}: F_Z \rightarrow L^1(\Omega , {\mathcal {F}}, {\mathbb {P}})\) given by \({\mathbb {F}}(x) := f(x,Z(\cdot ))\) is well defined and Lipschitz continuous by [1, Lemma 2.4]. In a situation where the parameter z in (1) is given by a realization of the random vector Z that the follower can observe while the leader has to decide x nonanticipatorily, the upper-level outcome can be modeled by \({\mathbb {F}}(x)\). If we assume \(X \subseteq F_Z\) and the leader’s decision is based on the expectation, we obtain the risk-neutral stochastic program

$$\begin{aligned} \min _x \left\{ {\mathbb {E}}[{\mathbb {F}}(x)] \; :\; x \in X \right\} . \end{aligned}$$
(2)
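For intuition, the optimistic value \(f(x,z)\) underlying (2) can be evaluated numerically by solving two linear programs in sequence: first the lower-level problem, then the minimization of \(q^\top y\) over its solution set. The following sketch does this with scipy on small hypothetical data (the matrices and vectors below are illustrative and not taken from the text):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical problem data: one leader variable, two follower variables.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])  # lower-level constraints A y <= T x + z
T = np.array([[0.0], [0.0], [1.0]])
d = np.array([1.0, 0.0])   # lower-level objective
q = np.array([0.0, -1.0])  # upper-level objective acting on y
c = 2.0                    # upper-level cost of the scalar leader variable x

def f(x, z):
    """Evaluate f(x,z) = c x + min { q^T y : y in Psi(x,z) } via two LPs."""
    rhs = T @ np.atleast_1d(x) + z
    free = [(None, None)] * 2  # y is a free variable
    # Step 1: lower-level optimal value v* = min { d^T y : A y <= rhs }.
    lower = linprog(d, A_ub=A, b_ub=rhs, bounds=free, method="highs")
    # Step 2: minimize q^T y over the argmin set { y : A y <= rhs, d^T y <= v* }.
    upper = linprog(q, A_ub=np.vstack([A, d]), b_ub=np.append(rhs, lower.fun),
                    bounds=free, method="highs")
    return c * x + upper.fun

# The lower level minimizes y1 over the simplex {y >= 0, y1 + y2 <= x + z3},
# leaving y2 free in [0, x + z3]; the second LP then drives q^T y = -y2 down to
# -(x + z3), so f(x, z) = c*x - (x + z3).
print(f(1.0, np.array([0.0, 0.0, 3.0])))  # -> -2.0
```

The extra constraint \(d^\top y \le v^*\) in the second LP restricts the feasible set to the lower-level argmin, which mirrors the optimistic tie-breaking in (1).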

The following is shown in [1, Theorem 3.1, Corollary 4.7]:

Theorem 2.1

Assume \(\mathrm {dom} \; f \ne \emptyset \) and that \(\mu _Z \in {\mathcal {M}}^1_s\) is absolutely continuous w.r.t. the Lebesgue measure. Then, the mapping \({\mathcal {Q}}_{\mathbb {E}}: F_Z \rightarrow {\mathbb {R}}\) defined by \({\mathcal {Q}}_{\mathbb {E}}(x) = {\mathbb {E}}[{\mathbb {F}}(x)]\) is well defined, Lipschitz continuous and continuously differentiable at any \(x_0 \in \mathrm {int} \; F_Z\).

We shall discuss some key ideas of the proof and introduce the relevant notation: Set

$$\begin{aligned} {\hat{q}} := \begin{pmatrix} q \\ -q \\ 0_s \end{pmatrix}, \; {\hat{y}} := \begin{pmatrix} y_+ \\ y_- \\ t \end{pmatrix}, \; {\hat{d}} := \begin{pmatrix} d \\ -d \\ 0_s \end{pmatrix}, \; \text {and} \; {\hat{A}} := (A,-A,I_s), \end{aligned}$$

then f admits the representation

$$\begin{aligned} f(x,z) = c^\top x + \min _{{\hat{y}}} \big \{ {\hat{q}}^\top {\hat{y}} \; :\; {\hat{y}} \in \underset{{\hat{y}}'}{\mathrm {Argmin}} \lbrace {\hat{d}}^\top {\hat{y}}' \; :\; {\hat{A}}{\hat{y}}' = Tx+z, \; {\hat{y}}' \ge 0 \rbrace \big \}. \end{aligned}$$
(3)
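The reduction behind (3) is the standard conversion to equality form: split \(y = y_+ - y_-\) and add slack variables t. A minimal sketch of the construction (the data are chosen arbitrarily for illustration):

```python
import numpy as np

def standard_form(A, q, d):
    """Build q_hat, d_hat, A_hat of (3) from inequality-form data.

    The split y = y_plus - y_minus plus slacks t turns A y <= T x + z into
    A_hat y_hat = T x + z, y_hat >= 0 with y_hat = (y_plus, y_minus, t).
    """
    s, m = A.shape
    q_hat = np.concatenate([q, -q, np.zeros(s)])
    d_hat = np.concatenate([d, -d, np.zeros(s)])
    A_hat = np.hstack([A, -A, np.eye(s)])
    return q_hat, d_hat, A_hat

# Illustrative data with s = 2 constraint rows and m = 2 follower variables.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
q_hat, d_hat, A_hat = standard_form(A, q=np.array([1.0, -1.0]), d=np.array([0.5, 0.0]))

# A_hat contains the identity block, so rank A_hat = s.
assert np.linalg.matrix_rank(A_hat) == 2
# Any y with A y <= b maps to a feasible y_hat = (y_plus, y_minus, slack).
y = np.array([1.0, -2.0])
b = np.array([5.0, 6.0])
y_hat = np.concatenate([np.maximum(y, 0), np.maximum(-y, 0), b - A @ y])
assert np.allclose(A_hat @ y_hat, b) and (y_hat >= 0).all()
```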

Remark 2.1

The subsequent analysis does not depend on the specific structure of \({\hat{q}}, {\hat{d}}\) and \({\hat{A}}\) and applies whenever (3) holds with some matrix \({\hat{A}}\) satisfying \(\mathrm {rank} \; {\hat{A}} = s\).

As the rows of \({\hat{A}}\) are linearly independent, the set

$$\begin{aligned} {\mathcal {A}} := \lbrace {\hat{A}}_B \in {\mathbb {R}}^{s \times s} \; :\; {\hat{A}}_B \; \text {is a regular submatrix of } {\hat{A}} \rbrace \end{aligned}$$

of lower-level base matrices is nonempty. A base matrix \({\hat{A}}_B \in {\mathcal {A}}\) is optimal for the lower-level problem for a given \((x,z)\) if it is feasible, i.e.,

\({\hat{A}}_B^{-1}(Tx+z) \ge 0\), and the associated reduced cost vector \({\hat{d}}_N^\top - {\hat{d}}_B^\top {\hat{A}}_B^{-1} {\hat{A}}_N\) is nonnegative. Furthermore, for any optimal base matrix \({\hat{A}}_{B'} \in {\mathcal {A}}\), there exists a feasible base matrix \({\hat{A}}_B \in {\mathcal {A}}\) satisfying

$$\begin{aligned} {\hat{A}}_{B'}^{-1}(Tx+z) = {\hat{A}}_{B}^{-1}(Tx+z) \; \; \text {and} \; \; {\hat{d}}_N^\top - {\hat{d}}_B^\top {\hat{A}}_B^{-1} {\hat{A}}_N \ge 0. \end{aligned}$$

Set

$$\begin{aligned} {\mathcal {A}}^*:= \lbrace {\hat{A}}_B \in {\mathcal {A}} \; :\; {\hat{d}}_N^\top - {\hat{d}}_B^\top {\hat{A}}_B^{-1} {\hat{A}}_N \ge 0 \rbrace \end{aligned}$$

and assume \(\mathrm {dom} \; f \ne \emptyset \); then,

$$\begin{aligned} f(x,z) = c^\top x + \min _{{\hat{A}}_B} \big \{ {\hat{q}}^\top _B {\hat{A}}_B^{-1}(Tx+z) \; :\; {\hat{A}}_B^{-1}(Tx+z) \ge 0, \; {\hat{A}}_B \in {\mathcal {A}}^*\big \} \end{aligned}$$

holds for any \((x,z) \in F\). A key concept is the region of strong stability associated with a base matrix \({\hat{A}}_{B} \in {\mathcal {A}}^*\) given by the set

$$\begin{aligned} {\mathcal {S}}({\hat{A}}_B) := \lbrace (x,z) \in F \; :\; {\hat{A}}_B^{-1}(Tx + z) \ge 0, \; c^\top x + {\hat{q}}^\top _B {\hat{A}}_B^{-1}(Tx+z) = f(x,z) \rbrace , \end{aligned}$$

on which f coincides with the affine linear mapping

$$\begin{aligned} f(x,z) = c^\top x + {\hat{q}}^\top _B {\hat{A}}_B^{-1}(Tx+z). \end{aligned}$$

Under the assumptions of Theorem 2.1, we have

$$\begin{aligned} F = \bigcup _{{\hat{A}}_B \in {\mathcal {A}}^*} {\mathcal {S}}({\hat{A}}_B) \end{aligned}$$

and the gradient of \({\mathcal {Q}}_{\mathbb {E}}\) admits the representation

$$\begin{aligned} \nabla {\mathcal {Q}}_{{\mathbb {E}}}(x) = c^\top + \sum _{\varDelta \in D} \mu _Z[{\mathcal {W}}(x,\varDelta )] \varDelta \; \; \forall x \in \mathrm {int} \; F_Z \end{aligned}$$
(4)

where \(D := \lbrace {\hat{q}}_B^\top {\hat{A}}_B^{-1} T \; :\; {\hat{A}}_B \in {\mathcal {A}}^*\rbrace \), and the set-valued aggregation mappings \({\mathcal {W}}, \overline{{\mathcal {W}}}: {\mathbb {R}}^n \times D \rightrightarrows {\mathbb {R}}^s\) are given by

$$\begin{aligned}&{\mathcal {W}}(x,\varDelta ) :=\left\{ z \in {\mathbb {R}}^s \; :\; (x,z) \in \bigcup _{{\hat{A}}_B \in {\mathcal {A}}^*: \; {\hat{q}}_B^\top {\hat{A}}_B^{-1} T = \varDelta } \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B)\right\} \\ \text {and} \; \;&\overline{{\mathcal {W}}}(x,\varDelta ) := \left\{ z \in {\mathbb {R}}^s \; :\; (x,z) \in \bigcup _{{\hat{A}}_B \in {\mathcal {A}}^*: \; {\hat{q}}_B^\top {\hat{A}}_B^{-1} T = \varDelta } \mathrm {cl} \; \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B)\right\} , \end{aligned}$$

respectively (cf. [1, Theorem 4.3, Corollary 4.7]). Continuity of \(\nabla {\mathcal {Q}}_{{\mathbb {E}}}\) follows from the fact that the outer semicontinuity of \(\overline{{\mathcal {W}}}\) and

$$\begin{aligned} \sum _{\varDelta \in D} \mu _Z[\overline{{\mathcal {W}}}(x,\varDelta )] = 1 \; \; \forall x \in \mathrm {int} \; F_Z \end{aligned}$$

imply continuity of the weight functional \(M_\varDelta : {\mathbb {R}}^n \rightarrow {\mathbb {R}}\),

$$\begin{aligned} M_\varDelta (x) := \mu _Z[{\mathcal {W}}(x,\varDelta )] = \mu _Z[\overline{{\mathcal {W}}}(x,\varDelta )] \end{aligned}$$
(5)

for any \(\varDelta \in D\).

3 Main Result

We shall first show that the assumptions of Theorem 2.1 are too weak to guarantee Lipschitz continuity of \(\nabla {\mathcal {Q}}_{\mathbb {E}}\).

Example 3.1

Consider the case where

$$\begin{aligned} {\hat{d}} = (0,0,0,0)^\top , \; {\hat{A}} = \begin{pmatrix} 1 &{}0 &{}1 &{}1 \\ 0 &{}1 &{}\frac{3}{2} &{} \frac{1}{2} \end{pmatrix} \; \; \text {and} \; \; T = (0,1)^\top . \end{aligned}$$

The feasible set of the lower-level problem is compact for any parameters in the polyhedral cone \(F = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; z_1 \ge 0, \; x + z_2 \ge 0 \rbrace \), which implies that \(\mathrm {dom} \; f\) coincides with F for any \({\hat{q}} \in {\mathbb {R}}^4\). As the objective function is constant, any feasible base matrix is optimal for the lower-level problem. Denote the elements of \({\mathcal {A}} = {\mathcal {A}}^*\) by \({\hat{A}}_1, \ldots , {\hat{A}}_6\), and let

$$\begin{aligned} \varTheta _i = \lbrace (x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; {\hat{A}}_i^{-1}(Tx + z) \ge 0 \rbrace \end{aligned}$$

be the set of parameters for which \({\hat{A}}_i\) is feasible for the lower-level problem. A straightforward calculation shows that we have

$$\begin{aligned} {\hat{A}}_1 = \begin{pmatrix} 1 &{}0 \\ 0 &{}1 \end{pmatrix}, \; {\hat{A}}_1^{-1} = \begin{pmatrix} 1 &{}0 \\ 0 &{}1 \end{pmatrix}, \; \varTheta _1 = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; z_1 \ge 0, \; x + z_2 \ge 0 \rbrace , \\ {\hat{A}}_2 = \begin{pmatrix} 1 &{}1 \\ 0 &{}\frac{3}{2} \end{pmatrix}, \; {\hat{A}}_2^{-1} = \begin{pmatrix} 1 &{}-\frac{2}{3} \\ 0 &{}\frac{2}{3} \end{pmatrix}, \; \varTheta _2 = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 \le x + z_2 \le \frac{3}{2}z_1 \rbrace , \\ {\hat{A}}_3 = \begin{pmatrix} 1 &{}1 \\ 0 &{}\frac{1}{2} \end{pmatrix}, \; {\hat{A}}_3^{-1} = \begin{pmatrix} 1 &{}-2 \\ 0 &{}2 \end{pmatrix}, \; \varTheta _3 = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 \le x + z_2 \le \frac{1}{2} z_1 \rbrace , \\ {\hat{A}}_4 = \begin{pmatrix} 0 &{}1 \\ 1 &{}\frac{3}{2} \end{pmatrix}, \; {\hat{A}}_4^{-1} = \begin{pmatrix} -\frac{3}{2} &{}1 \\ 1 &{}0 \end{pmatrix}, \; \varTheta _4 = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 \le \frac{3}{2} z_1 \le x + z_2 \rbrace \\ {\hat{A}}_5 = \begin{pmatrix} 0 &{}1 \\ 1 &{}\frac{1}{2} \end{pmatrix}, \; {\hat{A}}_5^{-1} = \begin{pmatrix} -\frac{1}{2} &{}1 \\ 1 &{}0 \end{pmatrix}, \; \varTheta _5 = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 \le \frac{1}{2} z_1 \le x + z_2\rbrace \end{aligned}$$

and

$$\begin{aligned} {\hat{A}}_6 = \begin{pmatrix} 1 &{}1 \\ \frac{3}{2} &{}\frac{1}{2} \end{pmatrix}, {\hat{A}}_6^{-1} = \begin{pmatrix} -\frac{1}{2} &{}1 \\ \frac{3}{2} &{}-1 \end{pmatrix}, \varTheta _6 = \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; \frac{1}{2} z_1 \le x + z_2 \le \frac{3}{2} z_1 \rbrace . \end{aligned}$$

Set \({\hat{q}} = (0,0,-5,-3)^\top \), and let \({\hat{q}}_i\) denote the part of the upper-level objective function that is associated with \({\hat{A}}_i\). We have

$$\begin{aligned}&{\hat{q}}_1^\top {\hat{A}}_1^{-1}T = 0, \; \; \; {\hat{q}}_2^\top {\hat{A}}_2^{-1}T = -\frac{10}{3}, \; \; \; {\hat{q}}_3^\top {\hat{A}}_3^{-1}T = -6, \\&{\hat{q}}_4^\top {\hat{A}}_4^{-1}T = 0, \; \; \; {\hat{q}}_5^\top {\hat{A}}_5^{-1}T = 0, \; \; \; {\hat{q}}_6^\top {\hat{A}}_6^{-1}T = -2 \end{aligned}$$

and a straightforward calculation yields

$$\begin{aligned} {\mathcal {S}}({\hat{A}}_1)&= \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; z_1 = 0, \; x + z_2 \ge 0 \rbrace \\&\cup \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; z_1 \ge 0, \; x + z_2 = 0 \rbrace \\&= \mathrm {bd} \; \varTheta _1, \\ {\mathcal {S}}({\hat{A}}_2)&= \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 = x + z_2 \le \frac{3}{2}z_1 \rbrace \\&\cup \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 \le x + z_2 = \frac{3}{2}z_1 \rbrace \\&= \mathrm {bd} \; \varTheta _2, \\ {\mathcal {S}}({\hat{A}}_3)&= \varTheta _3, \; \; {\mathcal {S}}({\hat{A}}_4) = \varTheta _4, \; \; {\mathcal {S}}({\hat{A}}_6) = \varTheta _6 \; \; \text {and} \\ {\mathcal {S}}({\hat{A}}_5)&= \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 = \frac{1}{2} z_1 \le x + z_2\rbrace \\&\cup \lbrace (x,z) \in {\mathbb {R}} \times {\mathbb {R}}^2 \; :\; 0 \le \frac{1}{2} z_1 = x + z_2\rbrace \\&= \mathrm {bd} \; \varTheta _5. \end{aligned}$$
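The values \({\hat{q}}_i^\top {\hat{A}}_i^{-1}T\) above can be reproduced mechanically by enumerating the \(2 \times 2\) column submatrices of \({\hat{A}}\); a small sketch:

```python
import numpy as np
from itertools import combinations

# Data of Example 3.1.
A_hat = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.5, 0.5]])
q_hat = np.array([0.0, 0.0, -5.0, -3.0])
T = np.array([0.0, 1.0])

# Enumerate the regular 2x2 column submatrices A_1, ..., A_6 (column pairs in
# the order (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)) and compute q_i^T A_i^{-1} T.
deltas = []
for cols in combinations(range(4), 2):
    A_i = A_hat[:, cols]
    assert abs(np.linalg.det(A_i)) > 1e-12  # all six submatrices are regular here
    deltas.append(q_hat[list(cols)] @ np.linalg.solve(A_i, T))

assert np.allclose(deltas, [0.0, -10 / 3, -6.0, 0.0, 0.0, -2.0])
```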

Let the density \(\delta _Z: {\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) of Z be given by

$$\begin{aligned} \delta _Z(t_1,t_2) = \left\{ \begin{array}{ll} \frac{1}{2\sqrt{\frac{1}{2}t_1 - t_2}}, &{}\quad \text {if} \; 3 \le t_1 \le 4 \; \text {and} \; \frac{1}{2}t_1 - 1 \le t_2 < \frac{1}{2}t_1\\ 0, &{}\quad \text {else} \end{array} \right. \end{aligned}$$

and set \(c=0\). We have \(\mathrm {supp} \; \mu _Z \subset \overline{{\mathcal {W}}}(0,-6)\), and it is easy to see that

$$\begin{aligned}&\overline{{\mathcal {W}}}(x,-6) \cap \mathrm {supp} \; \mu _Z = \lbrace z \in {\mathbb {R}}^2 \; :\; 3 \le z_1 \le 4, \; \frac{1}{2}z_1 - 1 \le z_2 \le \frac{1}{2}z_1 - x \rbrace \\ \text {and} \;&\overline{{\mathcal {W}}}(x,-2) \cap \mathrm {supp} \; \mu _Z = \lbrace z \in {\mathbb {R}}^2 \; :\; 3 \le z_1 \le 4, \; \frac{1}{2}z_1 - x \le z_2 < \frac{1}{2}z_1 \rbrace \end{aligned}$$

hold true whenever \(x \in ]0,1]\) (Fig. 1).

Fig. 1: The darker square depicts the intersection of \(\mathrm {supp} \; \mu _Z\) and \(\overline{{\mathcal {W}}}(\frac{1}{4}, -2)\), while the lighter square is \(\overline{{\mathcal {W}}}(\frac{1}{4}, -6) \cap \mathrm {supp} \; \mu _Z\). The distance between the dotted lines is \(x = \frac{1}{4}\).

Thus, \(\mathrm {supp} \; \mu _Z \subset \overline{{\mathcal {W}}}(x,-6) \cup \overline{{\mathcal {W}}}(x,-2)\) holds for any \(x \in [0,1]\) and a simple calculation shows that

$$\begin{aligned} \nabla {\mathcal {Q}}_{\mathbb {E}}(x)&= -6\mu _Z[\overline{{\mathcal {W}}}(x,-6)] -2\mu _Z[\overline{{\mathcal {W}}}(x,-2)] \\&= -6(1-\sqrt{x}) - 2\sqrt{x} = 4\sqrt{x} - 6 \end{aligned}$$

is not locally Lipschitz continuous at \(x=0\).
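The weights in the gradient formula can also be checked numerically: after the substitution \(u = \frac{1}{2}z_1 - z_2\), the density integrates to \(\mu _Z[\overline{{\mathcal {W}}}(x,-6)] = \int _x^1 \frac{1}{2\sqrt{u}}\,\mathrm {d}u = 1 - \sqrt{x}\) independently of \(z_1\). A quadrature sketch (the step count is chosen ad hoc):

```python
import numpy as np

def weight_minus6(x, n=200_000):
    """Midpoint-rule approximation of mu_Z[W_bar(x,-6)] = int_x^1 du / (2 sqrt(u))."""
    u = np.linspace(x, 1.0, n + 1)
    mid = 0.5 * (u[:-1] + u[1:])
    return np.sum(0.5 / np.sqrt(mid)) * (1.0 - x) / n

def grad_QE(x):
    # The two weights sum to 1 on supp(mu_Z), so mu_Z[W_bar(x,-2)] = 1 - w6.
    w6 = weight_minus6(x)
    return -6.0 * w6 - 2.0 * (1.0 - w6)

x = 0.25
assert abs(weight_minus6(x) - (1.0 - np.sqrt(x))) < 1e-8
assert abs(grad_QE(x) - (4.0 * np.sqrt(x) - 6.0)) < 1e-7
# The difference quotient (grad_QE(h) - grad_QE(0)) / h = 4 / sqrt(h) blows up
# as h -> 0, which is exactly the failure of local Lipschitz continuity at x = 0.
```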

Our main result establishes the following sufficient conditions for Lipschitz continuity of \(\nabla {\mathcal {Q}}_{\mathbb {E}}\):

Theorem 3.1

Assume \(\mathrm {dom} \; f \ne \emptyset \) and let \(\mu _Z\) be absolutely continuous w.r.t. the Lebesgue measure and have a bounded support as well as a uniformly bounded density. Then, \({\mathcal {Q}}_{\mathbb {E}}\) is differentiable on \(\mathrm {int} \; F_Z\) with Lipschitz continuous gradient.

Note that the density in Example 3.1 is not uniformly bounded. The proof of Theorem 3.1 requires some preliminary work and will be given in Sect. 6. If the support of \(\mu _Z\) is unbounded, we still obtain a weaker estimate for the gradients:

Theorem 3.2

Assume \(\mathrm {dom} \; f \ne \emptyset \) and let \(\mu _Z\) be absolutely continuous w.r.t. the Lebesgue measure and have a uniformly bounded density. Then, \({\mathcal {Q}}_{\mathbb {E}}\) is differentiable on \(\mathrm {int} \; F_Z\) and for any \(\epsilon > 0\) there exists a constant \(L(\epsilon ) > 0\) such that

$$\begin{aligned} \Vert \nabla {\mathcal {Q}}_{{\mathbb {E}}}(x) - \nabla {\mathcal {Q}}_{{\mathbb {E}}}(x')\Vert \le L(\epsilon ) \Vert x-x'\Vert + \epsilon \end{aligned}$$

holds for all \(x,x' \in \mathrm {int} \; F_Z\).

4 On the Geometry of Regions of Strong Stability

In view of (4) and (5), the gradient \(\nabla {\mathcal {Q}}_{{\mathbb {E}}}(x)\) is given by a weighted sum of the probabilities of the sets \({\mathcal {W}}(x,\varDelta )\) or \(\overline{{\mathcal {W}}}(x,\varDelta )\) for \(\varDelta \in D\). As these sets are defined using regions of strong stability, we shall first study properties of the sets \({\mathcal {S}}({\hat{A}}_B)\) with \({\hat{A}}_{B} \in {\mathcal {A}}^*\).

Remark 4.1

Example 3.1 shows that regions of strong stability are not convex in general.

Proposition 4.1

Assume \(\mathrm {dom} \; f \ne \emptyset \), then

$$\begin{aligned} {\mathcal {S}}({\hat{A}}_B) = {\mathcal {S}}({\hat{A}}_B) \; + \; \mathrm {ker} (T,I_s) \end{aligned}$$

holds for any \({\hat{A}}_B \in {\mathcal {A}}^*\).

Proof

The above result immediately follows from the fact that the quantities involved in the definition of \({\mathcal {S}}({\hat{A}}_B)\) only depend on \(Tx + z\). \(\square \)
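A quick numerical sanity check of this invariance, using the matrix T of Example 3.1 (the kernel direction is chosen accordingly):

```python
import numpy as np

# With T = (0, 1)^T as in Example 3.1, ker(T, I_2) = {(t, 0, -t) : t in R}.
# Shifting (x, z) along this kernel leaves T x + z -- and hence every quantity
# in the definition of S(A_B) -- unchanged, as Proposition 4.1 asserts.
T = np.array([[0.0], [1.0]])

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.normal(size=1)
    z = rng.normal(size=2)
    t = rng.normal()
    x_shift = x + t
    z_shift = z + np.array([0.0, -t])
    assert np.allclose(T @ x + z, T @ x_shift + z_shift)
```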

Corollary 4.1

Assume \(\mathrm {dom} \; f \ne \emptyset \) and \(n \ge 1\), then no region of strong stability has any extremal points.

Proof

Let \((x,z)\) be an arbitrary point of some region of strong stability \({\mathcal {S}}({\hat{A}}_B)\). The n-dimensional kernel of \((T,I_s)\) contains some nonzero element \((x_0,z_0)\), and we have \((x-x_0, z-z_0), (x+x_0, z+z_0) \in {\mathcal {S}}({\hat{A}}_B)\) by Proposition 4.1. Thus, \((x,z) = \frac{1}{2}(x-x_0, z-z_0) + \frac{1}{2}(x+x_0, z+z_0)\) is not an extremal point of \({\mathcal {S}}({\hat{A}}_B)\). \(\square \)

Our main result on the structure of \({\mathcal {S}}({\hat{A}}_B)\) is the following:

Theorem 4.1

Assume \(\mathrm {dom} \; f \ne \emptyset \), then any region of strong stability is a union of at most \((s+1)^{|{\mathcal {A}}^*|}\) polyhedral cones and at most \((s+1)^{|{\mathcal {A}}^*| - 1}\) of these cones have a nonempty interior. Moreover, the multifunction \({\mathcal {S}}: {\mathcal {A}}^*\rightrightarrows {\mathbb {R}}^n \times {\mathbb {R}}^s\) is polyhedral, i.e., \(\mathrm {gph} \; {\mathcal {S}}\) is a finite union of polyhedra.

Before we turn to the proof of Theorem 4.1, we will establish the following auxiliary result:

Lemma 4.1

Let \({\mathcal {W}} := \lbrace \xi \in {\mathbb {R}}^k \; :\; V \xi < 0 \rbrace \) with \(V \in {\mathbb {R}}^{l \times k}\) be nonempty, then

$$\begin{aligned} \mathrm {cl} \; {\mathcal {W}} = \overline{{\mathcal {W}}} := \lbrace \xi \in {\mathbb {R}}^k \; :\; V \xi \le 0 \rbrace . \end{aligned}$$

Proof

The inclusion \(\mathrm {cl} \; {\mathcal {W}} \subseteq \overline{{\mathcal {W}}}\) is trivial. Moreover, for any \(\xi _0 \in {\mathcal {W}} = \mathrm {int} \; \overline{{\mathcal {W}}}\) and \(\xi \in \overline{{\mathcal {W}}}\) the line segment principle (cf. [6, Lemma 2.1.6]) implies \([\xi _0, \xi ) \subseteq {\mathcal {W}}\) and thus \(\xi \in \mathrm {cl} \; {\mathcal {W}}\). \(\square \)

We are now ready to prove Theorem 4.1.

Proof

(Proof of Theorem 4.1) Denote the elements of the finite set \({\mathcal {A}}^*\) by \({\hat{A}}_1, \dots , {\hat{A}}_l\) and the associated parts of the objective function by \({\hat{q}}_1, \dots , {\hat{q}}_l\). Fix any index \(i \in \lbrace 1, \ldots , l \rbrace \); then for any \((x,z) \in F\) satisfying \({\hat{A}}_i^{-1}(Tx + z) \ge 0\), the constraint \(c^\top x + {\hat{q}}_i^\top {\hat{A}}_i^{-1}(Tx + z) = f(x,z)\) in the definition of \({\mathcal {S}}({\hat{A}}_i)\) can be reformulated as

$$\begin{aligned} \left( \big ( {\hat{q}}_i^\top {\hat{A}}_i^{-1} - {\hat{q}}_j^\top {\hat{A}}_j^{-1} \big )\big (Tx + z\big ) \le 0 \; \; \vee \; \; {\hat{A}}_j^{-1}(Tx + z) \ngeq 0 \right) \; \; \forall j = 1, \ldots , l. \end{aligned}$$

Introducing the sets

$$\begin{aligned} \varTheta _{i}&:= \lbrace (x,z) \in F \; :\; {\hat{A}}_i^{-1}(Tx + z) \ge 0 \rbrace \\&= \lbrace (x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; {\hat{A}}_i^{-1}(Tx + z) \ge 0 \rbrace , \\ \Gamma _{ij0}&:= \lbrace (x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; \big ( {\hat{q}}_i^\top {\hat{A}}_i^{-1} - {\hat{q}}_j^\top {\hat{A}}_j^{-1} \big )\big (Tx + z\big ) \le 0 \rbrace \; \; \text {and} \\ \Gamma _{ijk}&:= \lbrace (x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; e_k^\top {\hat{A}}_j^{-1}(Tx + z) < 0 \rbrace \end{aligned}$$

with indices \(j =1, \ldots , l\) and \(k =1, \ldots , s\) and using the fact that \({\mathcal {S}}({\hat{A}}_i)\) is closed by the Lipschitz continuity of f, we obtain the representation

$$\begin{aligned} {\mathcal {S}}({\hat{A}}_i) \;&= \; \mathrm {cl} \; \left( \varTheta _i \; \cap \; \bigcap _{j=1, \ldots , l} \; \bigcup _{k=0, \ldots , s} \Gamma _{ijk} \right) \\&= \; \bigcup _{\begin{array}{c} \alpha \in \lbrace 0, \ldots , s \rbrace ^l, \\ \varTheta _i \; \cap \; \bigcap _{j=1,\ldots , l} \Gamma _{ij\alpha _j} \ne \emptyset \end{array}} \mathrm {cl} \; \left( \varTheta _i \; \cap \; \bigcap _{\begin{array}{c} j=1,\ldots , l, \\ \alpha _j = 0 \end{array}} \Gamma _{ij0} \; \cap \; \bigcap _{\begin{array}{c} j=1,\ldots , l, \\ \alpha _j \ne 0 \end{array}} \Gamma _{ij\alpha _j} \right) . \end{aligned}$$

As \(\varTheta _i \; \cap \; \bigcap _{j=1,\ldots , l, \; \alpha _j = 0} \Gamma _{ij0}\) is convex and closed, while \(\bigcap _{j=1,\ldots , l, \; \alpha _j \ne 0} \Gamma _{ij\alpha _j}\) is convex and open, [6, Proposition 2.1.10] yields

$$\begin{aligned} {\mathcal {S}}({\hat{A}}_i) \;&= \; \bigcup _{\begin{array}{c} \alpha \in \lbrace 0, \ldots , s \rbrace ^l, \\ \varTheta _i \; \cap \; \bigcap _{j=1,\ldots , l} \Gamma _{ij\alpha _j} \ne \emptyset \end{array}} \left( \varTheta _i \; \cap \; \bigcap _{\begin{array}{c} j=1,\ldots , l, \\ \alpha _j = 0 \end{array}} \Gamma _{ij0} \; \cap \; \mathrm {cl} \; \bigcap _{\begin{array}{c} j=1,\ldots , l, \\ \alpha _j \ne 0 \end{array}} \Gamma _{ij\alpha _j} \right) . \end{aligned}$$

The sets \(\varTheta _i \; \cap \; \bigcap _{j=1,\ldots , l, \; \alpha _j = 0} \Gamma _{ij0}\) are obviously polyhedral cones, and Lemma 4.1 implies

$$\begin{aligned}&\mathrm {cl} \; \bigcap _{\begin{array}{c} j=1,\ldots , l, \\ \alpha _j \ne 0 \end{array}} \Gamma _{ij\alpha _j} \\&\quad = \; \left\{ (x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; e_{\alpha _j}^\top {\hat{A}}_j^{-1}(Tx + z) \le 0 \; \forall j = 1, \ldots , l: \alpha _j \ne 0 \right\} \\&\quad = \; \bigcap _{\begin{array}{c} j=1,\ldots , l, \\ \alpha _j \ne 0 \end{array}} \mathrm {cl} \; \Gamma _{ij\alpha _j}. \end{aligned}$$

Moreover, for any \(\alpha _i \in \lbrace 1, \ldots , s \rbrace \) we have \(e_{\alpha _i}^\top {\hat{A}}_i^{-1} \ne 0\) and thus

$$\begin{aligned}&\mathrm {int} \; \left( \varTheta _i \cap \bigcap _{j=1, \ldots , l} \mathrm {cl} \; \Gamma _{ij\alpha _j} \right) \\&\quad \subseteq \; \mathrm {int} \; \left( \varTheta _i \cap \mathrm {cl} \; \Gamma _{ii\alpha _i} \right) \\&\quad \subseteq \; \mathrm {int} \left\{ (x,z) \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; (e_{\alpha _i}^\top {\hat{A}}_i^{-1}T, e_{\alpha _i}^\top {\hat{A}}_i^{-1}) \begin{pmatrix} x \\ z \end{pmatrix} = 0\right\} \\&\quad = \; \emptyset . \end{aligned}$$

The second part of the theorem is an immediate consequence of the finiteness of \({\mathcal {A}}^*\). \(\square \)

Corollary 4.2

Assume \(\mathrm {dom} \; f \ne \emptyset \), then any region of strong stability is star-shaped with respect to the origin and contains the n-dimensional kernel of \((T,I_s)\).

Proof

Star-shapedness with respect to the origin is an immediate consequence of Theorem 4.1: as a union of polyhedral cones, any region of strong stability contains the line segment from the origin to any of its points. The second statement directly follows from Proposition 4.1. \(\square \)

Two-stage stochastic programming can be understood as the special case of bi-level stochastic programming where the objectives of leader and follower coincide. In this case, any region of strong stability is a polyhedral cone and thus convex:

Proposition 4.2

Assume \(\mathrm {dom} \; f \ne \emptyset \) and \({\hat{q}} = \alpha {\hat{d}}\) for some \(\alpha > 0\). Then, any region of strong stability is a polyhedral cone.

Proof

We shall use the notation of the proof of Theorem 4.1 and denote the part of \({\hat{d}}\) associated with \({\hat{A}}_i\) by \({\hat{d}}_i\). Fix any \((x,z) \in F\) and consider any base matrices \({\hat{A}}_i, {\hat{A}}_j \in {\mathcal {A}}^*\) that are feasible and thus optimal for the lower-level problem. As

$$\begin{aligned} {\hat{q}}_i^\top {\hat{A}}_i^{-1}(Tx+z) = \alpha {\hat{d}}_i^\top {\hat{A}}_i^{-1}(Tx+z) = \alpha {\hat{d}}_j^\top {\hat{A}}_j^{-1}(Tx+z) = {\hat{q}}_j^\top {\hat{A}}_j^{-1}(Tx+z), \end{aligned}$$

both base matrices are also optimal with respect to the upper-level objective function. Thus, \({\mathcal {S}}({\hat{A}}_i)\) coincides with the polyhedral cone \(\varTheta _i\). \(\square \)

Remark 4.2

As \({\hat{d}} = (0,0,0,0)^\top \) holds in Example 3.1, we see that the assumption \({\hat{q}} = \alpha {\hat{d}}\) for some \(\alpha > 0\) in Proposition 4.2 cannot be replaced with the weaker condition that \(\lbrace {\hat{q}}, {\hat{d}} \rbrace \) is linearly dependent.

5 Properties of the Aggregation Mappings

We shall now study the aggregation mappings \({\mathcal {W}}\) and \(\overline{{\mathcal {W}}}\) defined in Sect. 2. The following result is the counterpart of Theorem 4.1:

Theorem 5.1

Assume \(\mathrm {dom} \; f \ne \emptyset \), then the multifunction \(\overline{{\mathcal {W}}}\) is polyhedral. Moreover, \(\overline{{\mathcal {W}}}(x,\varDelta )\) is a finite union of polyhedra for any \((x, \varDelta ) \in {\mathbb {R}}^n \times D\).

The proof of Theorem 5.1 will be based on the following auxiliary result:

Lemma 5.1

Let \(C_1, \ldots , C_l \subseteq {\mathbb {R}}^k\) be closed and convex. Then,

$$\begin{aligned} \mathrm {cl} \; \mathrm {int} \; \bigcup _{i=1, \ldots , l} C_i = \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i. \end{aligned}$$

Proof

As the sets \(C_1, \ldots , C_l\) are closed and the interior of a union is contained in the union of the interiors, we have

$$\begin{aligned}&\bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i \; \supseteq \; \mathrm {cl} \; \mathrm {int} \; \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i \; \supseteq \; \mathrm {cl} \; \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } \mathrm {int} \; C_i \\&\quad = \; \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } \mathrm {cl} \; \mathrm {int} \; C_i \; = \; \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i, \end{aligned}$$

where the first equality is due to the fact that the closure of a finite union equals the union of the closures and the second equality is a direct consequence of the line segment principle. Thus,

$$\begin{aligned} \mathrm {cl} \; \mathrm {int} \; \bigcup _{i=1, \ldots , l} C_i \; \supseteq \; \mathrm {cl} \; \mathrm {int} \; \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i \; = \; \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i. \end{aligned}$$

For the reverse inclusion, suppose that there is some

$$\begin{aligned} x \in \left( \mathrm {cl} \; \mathrm {int} \bigcup _{i=1, \ldots , l} C_i \right) \Bigg \backslash \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i. \end{aligned}$$

By definition, there are sequences \(\lbrace x_n \rbrace _{n \in {\mathbb {N}}} \subset {\mathbb {R}}^k\) and \(\lbrace \epsilon _n \rbrace _{n \in {\mathbb {N}}} \subset {\mathbb {R}}_{>0}\) satisfying \(x_n \rightarrow x\) and \(B_{\epsilon _n}(x_n) \subseteq \bigcup _{i=1, \ldots , l} C_i\) for all \(n \in {\mathbb {N}}\). As \(\bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i\) is closed, there exists some \(N \in {\mathbb {N}}\) such that \(x_n \notin \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i\) for all \(n \ge N\). Together with the previous considerations, the strong separation theorem (cf. [9, Theorem 11.4]) yields the existence of some \(\delta _N \in (0,\epsilon _N]\) such that

$$\begin{aligned} B_{\delta _N}(x_N) \subseteq \left( \bigcup _{i=1, \ldots , l} C_i \right) \Bigg \backslash \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i \; \subseteq \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i = \emptyset } C_i. \end{aligned}$$

As any \(C_i\) with \(\mathrm {int} \; C_i = \emptyset \) is contained in an affine subspace of dimension strictly smaller than k (cf. [2, Section 2.5.2]), we obtain the contradiction

$$\begin{aligned} \emptyset \; \ne \; \mathrm {int} \; B_{\delta _N}(x_N) \; \subseteq \; \mathrm {int} \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i = \emptyset } C_i \; = \; \emptyset . \end{aligned}$$

Thus,

$$\begin{aligned} \left( \mathrm {cl} \; \mathrm {int} \bigcup _{i=1, \ldots , l} C_i \right) \Bigg \backslash \bigcup _{i=1, \ldots , l: \; \mathrm {int} \; C_i \ne \emptyset } C_i \; = \; \emptyset , \end{aligned}$$

which completes the proof. \(\square \)
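A discrete one-dimensional illustration of Lemma 5.1, purely for intuition: with \(C_1 = [0,1]\) and the singleton \(C_2 = \lbrace 2 \rbrace \), the lemma gives \(\mathrm {cl} \; \mathrm {int} \; (C_1 \cup C_2) = C_1\). On a grid, interior and closure can be approximated by erosion and dilation:

```python
import numpy as np

# Represent subsets of [-1, 4] on a uniform grid.
grid = np.round(np.arange(-1.0, 4.0 + 1e-9, 0.01), 2)

def erode(mask):   # grid analogue of taking the interior
    out = mask.copy()
    out[1:] &= mask[:-1]
    out[:-1] &= mask[1:]
    return out

def dilate(mask):  # grid analogue of taking the closure
    out = mask.copy()
    out[1:] |= mask[:-1]
    out[:-1] |= mask[1:]
    return out

C1 = (grid >= 0.0) & (grid <= 1.0)   # interval, nonempty interior
C2 = grid == 2.0                     # singleton, empty interior
cl_int_union = dilate(erode(C1 | C2))

# The lower-dimensional piece C2 is erased, as Lemma 5.1 predicts.
assert np.array_equal(cl_int_union, C1)
```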

Corollary 5.1

Let \(C \subseteq {\mathbb {R}}^k\) be a finite union of polyhedra (polyhedral cones). Then, \(\mathrm {cl} \; \mathrm {int} \; C\) is a finite union of polyhedra (polyhedral cones).

Proof

The above statement is an immediate consequence of Lemma 5.1. \(\square \)

Proof

(Proof of Theorem 5.1) As D is finite, it is sufficient to consider the multifunctions \(\overline{{\mathcal {W}}}(\cdot , \varDelta ): {\mathbb {R}}^n \rightrightarrows {\mathbb {R}}^s\) for fixed \(\varDelta \in D\). We have

$$\begin{aligned} \mathrm {gph} \; \overline{{\mathcal {W}}}(\cdot , \varDelta ) = \bigcup _{{\hat{A}}_B \in {\mathcal {A}}^*: \; {\hat{q}}_B^\top {\hat{A}}_B^{-1}T = \varDelta } \mathrm {cl} \; \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B), \end{aligned}$$

which is a finite union of polyhedra by Corollary 5.1. Similarly, \(\overline{{\mathcal {W}}}(x,\varDelta )\) admits the representation

$$\begin{aligned} \overline{{\mathcal {W}}}(x,\varDelta ) = \bigcup _{{\hat{A}}_B \in {\mathcal {A}}^*: \; {\hat{q}}_B^\top {\hat{A}}_B^{-1}T = \varDelta } \lbrace z \in {\mathbb {R}}^s \; :\; (x,z) \in \mathrm {cl} \; \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B) \rbrace . \end{aligned}$$

By Theorem 4.1 and Corollary 5.1, the set

$$\begin{aligned} \lbrace z \in {\mathbb {R}}^s \; :\; (x,z) \in \mathrm {cl} \; \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B) \rbrace \end{aligned}$$

is the intersection of a finite union of polyhedral cones and the affine subspace \(\lbrace (x',z') \in {\mathbb {R}}^n \times {\mathbb {R}}^s \; :\; x' = x \rbrace \) and thus a finite union of polyhedral cones for any \(x \in {\mathbb {R}}^n\) and any \({\hat{A}}_B \in {\mathcal {A}}^*\). \(\square \)

The following result on \({\mathcal {W}}\) is a simple consequence of the fact that the constraint system describing a region of strong stability only imposes conditions on \((Tx+z)\).

Proposition 5.1

Assume \(\mathrm {dom} \; f \ne \emptyset \). Then,

$$\begin{aligned} {\mathcal {W}}(x,\varDelta ) = {\mathcal {W}}(x',\varDelta ) + \lbrace T(x'-x) \rbrace \end{aligned}$$

holds for any \(x,x' \in {\mathbb {R}}^n\) and \(\varDelta \in D\).

Proof

Fix any \(x, x' \in {\mathbb {R}}^n\), \(z \in {\mathbb {R}}^s\) and set \(z' = z + T(x-x')\), then \(Tx + z = Tx' + z'\) and thus

$$\begin{aligned} f(x,z) - c^\top x&= \min _y \big \{ q^\top y \; :\; y \in \underset{y'}{\mathrm {Argmin}} \lbrace d^\top y' \; :\; Ay' \le Tx + z \rbrace \big \} \\&= f(x',z') - c^\top x'. \end{aligned}$$

Similarly, for any \({\hat{A}}_B \in {\mathcal {A}}^*\), \((x,z) \in {\mathcal {S}}({\hat{A}}_B)\) holds if and only if

  1. there exists some \(y \in {\mathbb {R}}^m\) such that \(Ay \le Tx + z = Tx' + z'\),

  2. \({\hat{A}}_B^{-1}(Tx' + z') = {\hat{A}}_B^{-1}(Tx + z) \ge 0\) and

  3. \({\hat{q}}_B^\top {\hat{A}}_B^{-1}(Tx' + z') = {\hat{q}}_B^\top {\hat{A}}_B^{-1}(Tx + z) = f(x,z) - c^\top x = f(x',z') - c^\top x',\)

i.e., if and only if \((x',z') \in {\mathcal {S}}({\hat{A}}_B)\). We conclude that

$$\begin{aligned}&z \in {\mathcal {W}}(x, \varDelta ) \; \Leftrightarrow \; \exists {\hat{A}}_B \in {\mathcal {A}}^*: \; \varDelta = {\hat{q}}_B^\top {\hat{A}}_B^{-1}T, \; (x,z) \in \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B) \\&\quad \Leftrightarrow \; \exists {\hat{A}}_B \in {\mathcal {A}}^*: \; \varDelta = {\hat{q}}_B^\top {\hat{A}}_B^{-1}T, \; (x',z') \in \mathrm {int} \; {\mathcal {S}}({\hat{A}}_B) \\&\quad \Leftrightarrow \; z' \in {\mathcal {W}}(x', \varDelta ) \end{aligned}$$

holds for any \(\varDelta \in D\), which completes the proof. \(\square \)
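The shift identity can be sanity-checked numerically: since membership of \(z\) in \({\mathcal {W}}(x,\varDelta )\) depends on \((x,z)\) only through \(Tx+z\), the substitution \(z' = z + T(x-x')\) used in the proof must preserve membership. A minimal sketch with a hypothetical membership test (the matrix \(T\) and the region are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 2))          # hypothetical matrix T : R^2 -> R^3

def in_W(x, z):
    # toy stand-in for W(x, Delta): membership depends on (x, z)
    # only through the aggregate Tx + z, as in the proof above
    w = T @ x + z
    return bool(np.all(w >= 0) and w[0] + 2 * w[1] <= 5 + w[2])

for _ in range(1000):
    x, xp = rng.standard_normal(2), rng.standard_normal(2)
    z = rng.uniform(-5.0, 5.0, size=3)
    zp = z + T @ (x - xp)                # the substitution z' = z + T(x - x')
    assert in_W(x, z) == in_W(xp, zp)    # membership is preserved
print("shift identity verified on 1000 random samples")
```

Any membership rule that factors through \(Tx+z\) passes this check, which is exactly the structural property the proposition exploits.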

6 Proof of the Main Result

We are finally ready to prove Theorem 3.1 based on the results of Sects. 4 and 5 as well as the following two auxiliary results:

Lemma 6.1

Assume \(\mathrm {dom} \; f \ne \emptyset \), and let \(\mu _Z \in {\mathcal {P}}({\mathbb {R}}^s)\) be absolutely continuous w.r.t. the Lebesgue measure. Then,

$$\begin{aligned} \mu _Z\left[ {\mathcal {W}}(x,\varDelta ) \setminus \big ( {\mathcal {W}}(x,\varDelta ) + \lbrace t \rbrace \big ) \right] \le \mu _Z\left[ \overline{{\mathcal {W}}}(x,\varDelta ) \setminus \big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \lbrace t \rbrace \big ) \right] \end{aligned}$$

holds for any \(x\in {\mathbb {R}}^n\), \(\varDelta \in D\) and \(t \in {\mathbb {R}}^s\).

Proof

By the arguments used in the proof of [1, Lemma 4.2], we have

$$\begin{aligned} {\mathcal {W}}(x,\varDelta ) \subseteq \overline{{\mathcal {W}}}(x,\varDelta ) \subseteq {\mathcal {W}}(x,\varDelta ) \cup {\mathcal {N}}_x, \end{aligned}$$

where \({\mathcal {N}}_x \subset {\mathbb {R}}^s\) is contained in a finite union of hyperplanes. Consequently,

$$\begin{aligned}&{\mathcal {W}}(x,\varDelta ) \setminus \big ( {\mathcal {W}}(x,\varDelta ) + \lbrace t \rbrace \big ) \\&\quad \subseteq \; \Big [ \overline{{\mathcal {W}}}(x,\varDelta ) \setminus \big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \lbrace t \rbrace \big ) \Big ] \cup \Big [ \big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \lbrace t \rbrace \big ) \setminus \big ( {\mathcal {W}}(x,\varDelta ) + \lbrace t \rbrace \big ) \Big ] \\&\quad = \; \Big [ \overline{{\mathcal {W}}}(x,\varDelta ) \setminus \big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \lbrace t \rbrace \big ) \Big ] \cup \Big [ \big ( \overline{{\mathcal {W}}}(x,\varDelta ) \setminus {\mathcal {W}}(x,\varDelta ) \big ) + \lbrace t \rbrace \Big ] \\&\quad \subseteq \; \Big [ \overline{{\mathcal {W}}}(x,\varDelta ) \setminus \big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \lbrace t \rbrace \big ) \Big ] \cup \Big [ {\mathcal {N}}_x + \lbrace t \rbrace \Big ] \end{aligned}$$

and the above statement is a direct consequence of the fact that the Lebesgue measure of \({\mathcal {N}}_x\) equals zero. \(\square \)
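The set-algebraic step in the display above holds for arbitrary sets: for \(A \subseteq B\) and any shift \(t\), the difference \(A \setminus (A + t)\) is covered by \([B \setminus (B+t)] \cup [(B \setminus A) + t]\). A quick sketch verifying this inclusion on random finite integer sets:

```python
import random

random.seed(1)

def shift(S, t):
    return {s + t for s in S}

for trial in range(500):
    B = {random.randrange(-20, 20) for _ in range(15)}
    A = {b for b in B if random.random() < 0.6}   # guarantees A is a subset of B
    t = random.randrange(-5, 6)
    lhs = A - shift(A, t)                         # A \ (A + t)
    rhs = (B - shift(B, t)) | shift(B - A, t)     # [B \ (B+t)] u [(B \ A) + t]
    assert lhs <= rhs                             # the covering inclusion
print("inclusion verified on 500 random (A, B, t) triples")
```

In the lemma, \(A = {\mathcal {W}}(x,\varDelta )\), \(B = \overline{{\mathcal {W}}}(x,\varDelta )\) and \(B \setminus A \subseteq {\mathcal {N}}_x\) has Lebesgue measure zero, which yields the claim.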

Lemma 6.2

Assume \(\mathrm {dom} \; f \ne \emptyset \) and let \(\mu _Z\) be absolutely continuous w.r.t. the Lebesgue measure and have a bounded support as well as a uniformly bounded density. Then, the weight functional \(M_\varDelta \) is Lipschitz continuous on \(\mathrm {int} \; F_Z\) for any \(\varDelta \in D\).

Proof

By definition of \({\mathcal {W}}(x,\varDelta )\), Proposition 5.1 and Lemma 6.1,

$$\begin{aligned}&|\mu _Z\big [{\mathcal {W}}(x,\varDelta )\big ] - \mu _Z\big [{\mathcal {W}}(x',\varDelta )\big ]| \\&\quad = \; |\mu _Z\Big [{\mathcal {W}}(x,\varDelta )\Big ] - \mu _Z\Big [{\mathcal {W}}(x,\varDelta ) + \big \{ T(x-x') \big \}\Big ]| \\&\quad \le \; \mu _Z \left[ {\mathcal {W}}(x,\varDelta ) \setminus \Big ( {\mathcal {W}}(x,\varDelta ) + \big \{ T(x-x') \big \} \Big ) \right] \\&\qquad + \; \mu _Z \left[ \Big ( {\mathcal {W}}(x,\varDelta ) + \big \{ T(x-x') \big \} \Big ) \setminus {\mathcal {W}}(x,\varDelta ) \right] \\&\quad \le \; \mu _Z \left[ \overline{{\mathcal {W}}}(x,\varDelta ) \setminus \Big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \big \{ T(x-x') \big \} \Big ) \right] \\&\qquad + \; \mu _Z \left[ \Big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \big \{ T(x-x') \big \} \Big ) \setminus \overline{{\mathcal {W}}}(x,\varDelta ) \right] \end{aligned}$$

holds for any fixed \(\varDelta \in D\). As both

$$\begin{aligned} \overline{{\mathcal {W}}}(x,\varDelta ) \setminus \Big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \big \{ T(x-x') \big \} \Big ) \; \; \text {and} \; \; \Big ( \overline{{\mathcal {W}}}(x,\varDelta ) + \big \{ T(x-x') \big \} \Big ) \setminus \overline{{\mathcal {W}}}(x,\varDelta ) \end{aligned}$$

are contained in

$$\begin{aligned} {\mathcal {H}}_{x, x'} := \big \{ v + l \cdot T(x'-x) \; :\; v \in \mathrm {bd} \; \overline{{\mathcal {W}}}(x,\varDelta ), \; l \in [-1,1] \big \} \end{aligned}$$

and there exists a finite upper bound \(\alpha \in {\mathbb {R}}\) for the Lebesgue density of \(\mu _Z\), we have

$$\begin{aligned} |\mu _Z\big [{\mathcal {W}}(x,\varDelta )\big ] - \mu _Z\big [{\mathcal {W}}(x',\varDelta )\big ]| \; \le \; 2\alpha \lambda ^s\big [ {\mathcal {H}}_{x,x'} \cap \mathrm {supp} \; \mu _Z \big ], \end{aligned}$$

where \(\lambda ^s\) denotes the s-dimensional Lebesgue measure. By Theorem 5.1, the boundary of \(\overline{{\mathcal {W}}}(x,\varDelta )\) is contained in a finite union of lower-dimensional polyhedral cones. Let \({\mathbb {H}}_x\) denote a collection of such cones of minimal cardinality. It is a straightforward conclusion from the proofs of Theorem 4.1, Theorem 5.1 and Lemma 5.1 that the cardinality of \({\mathbb {H}}_x\) can be bounded by a constant \(K \in {\mathbb {N}}\) that does not depend on x. Moreover, as any \(H \in {\mathbb {H}}_x\) is contained in some hyperplane, the \((s-1)\)-dimensional Lebesgue measure of \(H \cap \mathrm {supp} \; \mu _Z\) is at most \(\mathrm {diam}(\mathrm {supp} \; \mu _Z)^{s-1}\). Thus,

$$\begin{aligned}&|\mu _Z\big [{\mathcal {W}}(x,\varDelta )\big ] - \mu _Z\big [{\mathcal {W}}(x',\varDelta )\big ]| \\&\quad \le \; 2\alpha \sum _{H \in {\mathbb {H}}_x} \lambda ^s\big [ \lbrace v + l \cdot T(x'-x) \; :\; v \in H, \; l \in [-1,1] \rbrace \cap \mathrm {supp} \; \mu _Z \big ] \\&\quad \le \; 4 \alpha K \cdot \mathrm {diam}(\mathrm {supp} \; \mu _Z )^{s-1} \cdot \Vert T \Vert _{{\mathcal {L}}({\mathbb {R}}^n, {\mathbb {R}}^s)} \Vert x'-x\Vert , \end{aligned}$$

by Cavalieri’s principle, which completes the proof. \(\square \)
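The Cavalieri-type estimate can be made concrete in \({\mathbb {R}}^2\): for a segment \(H\) of length \(\ell \) inside a line and a shift vector \(t\), the swept set \(\lbrace v + l \cdot t \; :\; v \in H, \; l \in [-1,1] \rbrace \) is a parallelogram whose area \(2\ell |t_\perp |\) never exceeds \(2\ell \Vert t\Vert \). A small numeric sketch (the segment and shifts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

ell = 3.0                                  # length of the segment H on the x-axis
for _ in range(200):
    t = rng.standard_normal(2)             # stands in for the shift T(x' - x)
    side1 = np.array([ell, 0.0])           # H = {(u, 0) : 0 <= u <= ell}
    side2 = 2.0 * t                        # l runs over [-1, 1], so the side is 2t
    # area of the swept parallelogram via the determinant of its edge vectors
    area = abs(np.linalg.det(np.column_stack([side1, side2])))
    assert area <= 2.0 * ell * np.linalg.norm(t) + 1e-12
print("Cavalieri bound verified for 200 random shifts")
```

Summing this bound over the at most \(K\) cones in \({\mathbb {H}}_x\) is precisely how the Lipschitz constant in the proof arises.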

Proof

(Proof of Theorem 3.1) Continuous differentiability on \(\mathrm {int} \; F_Z\) is a direct consequence of [1, Corollary 4.7]. Fix any \(x, x' \in \mathrm {int} \; F_Z\); then, (4) and Lemma 6.2 yield

$$\begin{aligned}&\Vert \nabla {\mathcal {Q}}_{\mathbb {E}}(x) - \nabla {\mathcal {Q}}_{\mathbb {E}}(x')\Vert \\&\quad \le \; \sum _{\varDelta \in D} |\mu _Z\big [{\mathcal {W}}(x,\varDelta )\big ] - \mu _Z\big [{\mathcal {W}}(x',\varDelta )\big ]| \cdot \Vert \varDelta \Vert \\&\quad \le \; 4 \alpha K |D| \cdot \mathrm {diam}(\mathrm {supp} \; \mu _Z )^{s-1} \cdot \max _{\varDelta \in D} \; \Vert \varDelta \Vert \cdot \Vert T \Vert _{{\mathcal {L}}({\mathbb {R}}^n, {\mathbb {R}}^s)} \Vert x'-x\Vert \end{aligned}$$

and thus the desired Lipschitz continuity. \(\square \)

Proof

(Proof of Theorem 3.2) Fix any \(\kappa > 0\). As \(\mu _Z\) is tight by [3, Theorem 1.3], there exists a compact set \(C(\kappa ) \subset {\mathbb {R}}^s\) such that \(\mu _Z[{\mathbb {R}}^s \setminus C(\kappa )] < \kappa \). Combining this with the estimate from the first part of the proof of Lemma 6.2 and using the same notation established therein, we see that

$$\begin{aligned} |\mu _Z[{\mathcal {W}}(x,\varDelta )] - \mu _Z[{\mathcal {W}}(x',\varDelta )]|&\le 2 \mu _Z[{\mathcal {H}}_{x,x'}] \\&= \; 2 \mu _Z[{\mathcal {H}}_{x,x'} \cap C(\kappa )] + 2 \mu _Z[{\mathcal {H}}_{x,x'} \setminus C(\kappa )] \\&\le \; 2 \alpha \lambda ^s[{\mathcal {H}}_{x,x'} \cap C(\kappa ) \cap \mathrm {supp} \; \mu _Z] + 2\kappa \end{aligned}$$

holds for any \(\varDelta \in D\). Thus,

$$\begin{aligned}&|\mu _Z\big [{\mathcal {W}}(x,\varDelta )\big ] - \mu _Z\big [{\mathcal {W}}(x',\varDelta )\big ]| \\&\quad \le \; 2\kappa + 2\alpha \sum _{H \in {\mathbb {H}}_x} \lambda ^s\big [ \lbrace v + l \cdot T(x'-x) \; :\; v \in H, \; l \in [-1,1] \rbrace \cap C(\kappa ) \cap \mathrm {supp} \; \mu _Z \big ] \\&\quad \le \; 2 \kappa + 4 \alpha K \cdot \mathrm {diam}(C(\kappa ) \cap \mathrm {supp} \; \mu _Z )^{s-1} \cdot \Vert T \Vert _{{\mathcal {L}}({\mathbb {R}}^n, {\mathbb {R}}^s)} \Vert x'-x\Vert . \end{aligned}$$

We therefore have

$$\begin{aligned}&\Vert \nabla {\mathcal {Q}}_{\mathbb {E}}(x) - \nabla {\mathcal {Q}}_{\mathbb {E}}(x')\Vert \\&\quad \le \; \sum _{\varDelta \in D} |\mu _Z\big [{\mathcal {W}}(x,\varDelta )\big ] - \mu _Z\big [{\mathcal {W}}(x',\varDelta )\big ]| \cdot \Vert \varDelta \Vert \\&\quad \le \; 4 \alpha K |D| \mathrm {diam}(C(\kappa ) \cap \mathrm {supp} \; \mu _Z )^{s-1} \max _{\varDelta \in D} \; \Vert \varDelta \Vert \cdot \Vert T \Vert _{{\mathcal {L}}({\mathbb {R}}^n, {\mathbb {R}}^s)} \Vert x'-x\Vert + 2|D|\kappa \end{aligned}$$

and choosing \(\kappa = \frac{\epsilon }{2|D|}\) yields the desired estimate. \(\square \)

Remark 6.1

The constant \(L(\epsilon )\) derived in the proof of Theorem 3.2 depends on \(\epsilon \). If the support of \(\mu _Z\) is unbounded, we have \(L(\epsilon ) \rightarrow \infty \) as \(\epsilon \downarrow 0\).

7 A Sufficient Second-Order Optimality Condition

Under the conditions of Theorem 3.1, \(\nabla {\mathcal {Q}}_{{\mathbb {E}}}\) is Lipschitz continuous on \(\mathrm {int} \; F_Z\) and thus differentiable almost everywhere on \(\mathrm {int} \; F_Z\) by Rademacher’s theorem. Let \({\mathcal {D}} \subseteq \mathrm {int} \; F_Z\) denote the set of points at which \(\nabla {\mathcal {Q}}_{{\mathbb {E}}}\) is differentiable. Then, the generalized Hessian of \({\mathcal {Q}}_{\mathbb {E}}\) in the sense of Clarke at \(x \in \mathrm {int} \; F_Z\) is the nonempty, convex and compact set

$$\begin{aligned} \partial ^2 {\mathcal {Q}}_{\mathbb {E}}(x) = \mathrm {conv} \left\{ H \in {\mathbb {R}}^{n \times n} \; :\; \exists \lbrace x_k \rbrace _{k \in {\mathbb {N}}} \subseteq {\mathcal {D}}: \; x_k \rightarrow x, \; \nabla ^2 {\mathcal {Q}}_{\mathbb {E}}(x_k) \rightarrow H \right\} . \end{aligned}$$

We have \(\partial ^2 {\mathcal {Q}}_{\mathbb {E}}(x) = \lbrace \nabla ^2 {\mathcal {Q}}_{\mathbb {E}}(x) \rbrace \) whenever \(x \in {\mathcal {D}}\).
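For intuition, consider the scalar \(C^{1,1}\) function \(g(x) = x|x|/2\), an illustrative stand-in for \({\mathcal {Q}}_{\mathbb {E}}\): its derivative \(|x|\) is Lipschitz, \(g'' = \mathrm {sign}(x)\) away from the origin, and the limiting second derivatives along sequences \(x_k \rightarrow 0\) are \(\pm 1\), so \(\partial ^2 g(0) = [-1,1]\). A sketch computing the limiting Hessians numerically:

```python
def grad_g(x):
    # gradient of g(x) = x*|x|/2, a C^{1,1} function that is not C^2 at 0
    return abs(x)

def hess_at(x, h=1e-7):
    # central difference for g'' at a point where grad_g is differentiable
    return (grad_g(x + h) - grad_g(x - h)) / (2 * h)

# limiting Hessians along sequences approaching 0 from either side
right = [hess_at(10.0 ** (-k)) for k in range(1, 5)]   # values near +1
left = [hess_at(-(10.0 ** (-k))) for k in range(1, 5)] # values near -1
print(right, left)
# the generalized Hessian at 0 is the convex hull of all such limits: [-1, 1]
```

The same recipe, applied componentwise to \(\nabla {\mathcal {Q}}_{\mathbb {E}}\) at points of \({\mathcal {D}}\) near \(x\), produces matrices approaching elements of \(\partial ^2 {\mathcal {Q}}_{\mathbb {E}}(x)\).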

Let the feasible set of (2) be given by \(X = \lbrace x \in {\mathbb {R}}^n \; :\; Bx \le b \rbrace \) with some \(B \in {\mathbb {R}}^{k \times n}\) and \(b \in {\mathbb {R}}^k\). The following second-order sufficient condition is based on [7]:

Theorem 7.1

Assume \(\mathrm {dom} \; f \ne \emptyset \), \(X \subseteq \mathrm {int} \; F_Z\) and let \(\mu _Z\) be absolutely continuous w.r.t. the Lebesgue measure and have a bounded support as well as a uniformly bounded density. Moreover, let \(({\bar{x}},{\bar{u}})\) be a KKT point of (2), i.e.,

$$\begin{aligned} \nabla {\mathcal {Q}}_{\mathbb {E}}({\bar{x}}) + B^\top {\bar{u}} = 0, \; B{\bar{x}} \le b, \; {\bar{u}}^\top (B{\bar{x}} - b) = 0, \; {\bar{u}} \ge 0 \end{aligned}$$

and assume that any \(H \in \partial ^2 {\mathcal {Q}}_{\mathbb {E}}({\bar{x}})\) is positive definite on

$$\begin{aligned} \left\{ h \in {\mathbb {R}}^n \; :\; \begin{matrix} e_i^\top B h = 0 \; \forall i: \; {\bar{u}}_i > 0 \\ e_j^\top B h \le 0 \; \forall j: \; {\bar{u}}_j = e_j^\top B {\bar{x}} = 0 \end{matrix} \right\} . \end{aligned}$$

Then, \({\bar{x}}\) is a strict local minimizer of order 2 of (2), i.e., there exist a neighborhood U of \({\bar{x}}\) and a constant \(L > 0\) such that

$$\begin{aligned} {\mathcal {Q}}_{\mathbb {E}}(x) > {\mathcal {Q}}_{\mathbb {E}}({\bar{x}}) + L \Vert x-{\bar{x}}\Vert ^2 \end{aligned}$$

holds for any \(x \in X \cap U\).

Proof

This is a straightforward conclusion from [7, Theorem 1]. \(\square \)
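The verification required by Theorem 7.1 can be sketched on a toy instance (here a smooth quadratic stand-in for \({\mathcal {Q}}_{\mathbb {E}}\), so the generalized Hessian is a singleton; all data below are illustrative): check the KKT system at a candidate point, then positive definiteness of each Hessian on the critical cone.

```python
import numpy as np

# toy data: Q(x) = 0.5*||x||^2 + x[0], feasible set {x : Bx <= b}
H = np.eye(2)                          # the (here unique) generalized Hessian
grad_Q = lambda x: x + np.array([1.0, 0.0])
B = np.array([[-1.0, 0.0]])            # constraint -x[0] <= 0, i.e. x[0] >= 0
b = np.array([0.0])

x_bar = np.array([0.0, 0.0])
u_bar = np.array([1.0])

# KKT system: stationarity, feasibility, complementarity, dual feasibility
assert np.allclose(grad_Q(x_bar) + B.T @ u_bar, 0)
assert np.all(B @ x_bar <= b)
assert np.isclose(u_bar @ (B @ x_bar - b), 0)
assert np.all(u_bar >= 0)

# critical cone: e_i^T B h = 0 for all i with u_bar[i] > 0, so h = (0, h2);
# positive definiteness of H on that cone reduces to H[1, 1] > 0
h = np.array([0.0, 1.0])
assert h @ H @ h > 0
print("KKT point with second-order sufficiency verified (toy instance)")
```

In the bi-level setting, the Hessian check must run over every \(H \in \partial ^2 {\mathcal {Q}}_{\mathbb {E}}({\bar{x}})\) rather than a single matrix.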

Remark 7.1

There are various other approaches for optimization problems with data in the class \(C^{1,1}\), which consists of differentiable functions with locally Lipschitzian gradients. For instance, second-order optimality conditions can also be formulated based on Dini (cf. [5, Section 4.4]) or Riemann (cf. [8]) derivatives.

8 Conclusions

We have derived sufficient conditions for Lipschitz continuity of the gradient of the expectation functional arising from a bi-level stochastic linear program with random right-hand side in the lower-level constraint system. Invoking the structure of the upper-level constraints, we used this result to formulate a second-order sufficient optimality condition for the risk-neutral bi-level stochastic program in terms of the generalized Hessian of \({\mathcal {Q}}_{\mathbb {E}}\). Moreover, the main result on the geometry of regions of strong stability and its counterpart for the aggregation mapping \(\overline{{\mathcal {W}}}\) may facilitate the computation or sample-based estimation of gradients of the expectation functional, which may enhance gradient descent-based methods. As any region of strong stability is a finite union of polyhedral cones, a promising approach is to employ spherical radial decomposition techniques to calculate \(\nabla {\mathcal {Q}}_{\mathbb {E}}\) (cf. [4, Chapter 4]). The details are beyond the scope of this paper but shall be addressed in future research.
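To illustrate why the conic structure helps: for a standard Gaussian measure, the spherical radial decomposition reduces the probability of a polyhedral cone to a purely spherical quantity, since the radial set \(\lbrace r \ge 0 \; :\; rv \in C \rbrace \) is either all of \([0,\infty )\) or trivial for each direction v. A Monte Carlo sketch for the nonnegative quadrant in \({\mathbb {R}}^2\), whose exact Gaussian probability is 1/4 (the cone is illustrative, not one of the regions studied above):

```python
import numpy as np

rng = np.random.default_rng(42)

def in_cone(z):
    # polyhedral cone C = {z : z >= 0}; membership is scale-invariant
    return np.all(z >= 0, axis=-1)

# spherical radial decomposition: for a cone, {r >= 0 : r*v in C} is either
# [0, inf) or {0}, so the Gaussian probability of C equals the uniform
# spherical measure of the directions lying in C
v = rng.standard_normal((200_000, 2))
v /= np.linalg.norm(v, axis=1, keepdims=True)   # uniform directions on S^1
estimate = in_cone(v).mean()

exact = 0.25                                    # quadrant spans a quarter circle
assert abs(estimate - exact) < 0.01
print(f"spherical estimate {estimate:.4f} vs exact {exact}")
```

For a finite union of cones, the same direction samples can be reused across all pieces, which is one reason this decomposition pairs well with the polyhedral-cone representation derived above.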