Abstract
The method of sliding modes (relaxation) was originally invented in optimal control in order to give a transparent proof of the maximum principle (a first-order necessary condition for a strong local minimum) using the local maximum principle (a first-order necessary condition for a weak local minimum). In the present work, we use this method to derive second-order necessary conditions for a strong local minimum on the basis of such conditions for a weak local minimum. For simplicity, we confine ourselves to the Mayer problem with endpoint equality and inequality constraints and control inequality constraints given by a finite number of twice smooth functions. Assuming that the gradients of active control constraints are linearly independent, we provide a rather short proof of second-order necessary conditions for a strong local minimum.
1 Introduction
The theory of second-order optimality conditions for different types of minima (strong, weak and the so-called Pontryagin) in optimal control (OC) is well-developed. It is associated with the names of Bonnans, Dmitruk, Frankowska, Hestenes, Ioffe, Malanowski, Maurer, Milyutin, Osmolovskii, Warga, Zeidan, and many others. We refer the interested reader to, e.g., [1,2,3,4,5] for historical comments and bibliographical remarks. Rather complete and advanced results were obtained by the Moscow group headed by Milyutin, to which belongs the author of the present publication. The main distinguishing feature of these results is that there is no gap between necessary and sufficient conditions. The conditions have the character of sign definiteness of a quadratic form (which, in simple cases, is the second variation of the Lagrange function) on the so-called critical cone. The type of the minimum affects only the choice of Lagrange multipliers involved in the conditions.
The necessary conditions, to which this work is devoted, were stated by the author (together with the relevant sufficient conditions) back in 1978; see [1, Supplement to Chapter VI, §2]. Later, much more general results were obtained in the author’s thesis [6]. Second-order conditions in [6] were obtained for problems with regular mixed state–control constraints, considered on a fixed or nonfixed time interval. Moreover, conditions in [6] were obtained for various types of minimum, and they took into account possible discontinuities of the first kind (jumps) of the optimal control, if any. The necessary conditions contained in [6], together with their proofs, were published in [7]. (The relevant sufficient conditions were published in [8].) But the proofs presented in [7] are complex and long due to the generality of the obtained results. Therefore, the question of getting simpler proofs, at least for partial results, is still pertinent. This is what we aim to do in the present work.
Recent years have been marked by renewed interest in second-order conditions in OC. Some progress was due to the fact that an arbitrary compact set was considered as a control constraint that was not specified using a finite number of smooth inequalities, see, e.g., [3, 9,10,11]. To work with such a control constraint, the authors used the transition from a control system to differential inclusion, and then applied the differential calculus apparatus for multivalued mappings, developed by Aubin and Frankowska in [12]. These ideas allowed us to obtain the necessary second-order condition for a strong local minimum for a problem with an arbitrary compact control set and a finite number of inequality-type endpoint constraints (see [10]), and then for a problem with a finite number of inequality-type state constraints, in the absence of any qualification assumptions for constraints (see [11]). Moreover, the conditions in [10] and [11] were obtained for an arbitrary measurable optimal control. Unfortunately, the developed approach did not allow us to immediately include in the problem the terminal constraints of the equality type.
A new interesting approach to obtaining necessary conditions for a strong local minimum in OC has been proposed by Ioffe in [13]. First-order necessary conditions for a strong local minimum in the form of the maximum principle (MP) were obtained in [13] for an OC problem with state, control and endpoint constraints, for a system governed by a differential inclusion, and under fairly general assumptions (in particular, the endpoint constraint was specified by an arbitrary closed set). The idea of the proof was completely based on the reduction of the original OC problem to a nonsmooth problem of Bolza type with the subsequent application of the necessary conditions for a strong local minimum in the latter.
This approach was further developed in [14], where conditions of both the first and second order for a strong local minimum were obtained for an OC problem with state constraints, Pontryagin standard dynamics, a control constraint U(t) with closed values, and a finite number of endpoint constraints. An important difference between [11] and [14] was that in [14], an OC problem was considered, containing not only endpoint constraints of the inequality type (as in [11]), but also endpoint constraints of the equality type. Moreover, in the necessary second-order condition in [14], there is a new quadratic term (the last term in inequality (21)), which is absent in traditional second-order conditions. The presence of this term determines the essential novelty of the necessary condition in [14], the effectiveness of which is confirmed by an example. At the same time, note that for the problem, considered in the present paper, the conditions from [14] (compared to the conditions of this paper) require an additional regularity assumption, associated with the endpoint constraint of the equality type.
It is worth noting that in all the mentioned works, including the present one, the idea of convexification of the right side of differential inclusion or differential equation was used one way or another.
In the present work, although the control constraint is not an arbitrary closed or compact set, we discuss second-order necessary conditions for a strong local minimum in a problem with endpoint constraints of both equality and inequality type, in the absence of any qualifying assumption for endpoint constraints, and again the reference control is an arbitrary bounded measurable function. Therefore, the present work can serve as a basis for further research.
As mentioned in the abstract, we use the sliding mode (relaxation) method to prove the necessary second-order conditions for a strong local minimum, based on the necessary second-order conditions for a weak local minimum. (A relatively short proof of the latter was given in [15].) The key role in the transition from conditions for a weak minimum to conditions for a strong one is played by the Dmitruk theorem [16, Theorem 1].
The paper is organized as follows. The main results are presented in Sect. 2. In particular, an OC problem is set in Sect. 2.1, the Lyusternik condition for the equality constraints of the OC problem and the concepts of the weak and strong local minima are recalled in Sect. 2.2, the second-order necessary conditions [15] for a weak local minimum are discussed in Sect. 2.3, and the second-order necessary conditions for a strong local minimum are formulated in Sect. 2.4 (see Theorem 2.2). Section 3 is entirely devoted to the proof of Theorem 2.2, which is the main one in the paper. Concluding remarks are given in Sect. 4.
2 First- and Second-Order Necessary Conditions in the Main Problem
2.1 Statement of the Main Problem
Denote by \(W^{1,1}([0,1],\mathrm{I\! R}^{d(x)})\) the Sobolev space of absolutely continuous functions \(x:[0,1]\rightarrow \mathrm{I\! R}^{d(x)}\) with the norm \(\Vert x(\cdot )\Vert _{1,1}=|x(0)|+\int _0^1|\dot{x}(t)|\mathrm{d}t\), and by \(L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\), the space of measurable essentially bounded functions \(u:[0,1]\rightarrow \mathrm{I\! R}^{d(u)}\) with the norm \(\Vert u(\cdot )\Vert _{\infty }=\mathrm{ess\,sup}_{[0,1]}|u(t)|,\) where \(|\cdot |\) denotes the Euclidean norm. Hereafter, by d(a), we denote the dimension of the vector a. Define the space \(\mathcal{W}:=W^{1,1}([0,1],\mathrm{I\! R}^{d(x)})\times L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\) with the norm of \(w(\cdot )=(x(\cdot ),u(\cdot ))\in \mathcal{W}\) given by \(\Vert w(\cdot )\Vert = \Vert x(\cdot )\Vert _{1,1}+\Vert u(\cdot )\Vert _{\infty }.\) In the sequel, for \((x,u)\in \mathcal{W}\), we set \(\xi _0=x(0)\), \(\xi _1=x(1)\) and \(\xi =(\xi _0,\xi _1).\)
Consider the Mayer OC problem in the space \(\mathcal{W}\):
where
The functions \(f:\mathrm{I\! R}^{{d(x)}} \times \mathrm{I\! R}^{d(u)}\rightarrow \mathrm{I\! R}^{d(x)}\), \(g:\mathrm{I\! R}^{d(u)}\rightarrow \mathrm{I\! R}^r\), \(F_i:\mathrm{I\! R}^{2{d(x)}}\rightarrow \mathrm{I\! R}\), \(i=0,\ldots ,k\), and \(K:\mathrm{I\! R}^{2{d(x)}}\rightarrow \mathrm{I\! R}^s\) are assumed to be twice continuously differentiable. We also assume that at every point \(u\in \mathrm{I\! R}^{d(u)}\) such that \(g(u)\le 0\), the gradients \( g'_i(u)\), \(i\in I_g(u)\) are linearly independent, where
is the set of active indices at u. We call (1)–(3) the main problem.
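For orientation, the problem described in this subsection can be sketched as follows. This is a reconstruction from the surrounding text, so the exact layout and the labels (1)–(4) are assumptions inferred from how they are referenced later:

$$\begin{aligned}&J(x,u):=F_0(x(0),x(1))\rightarrow \min , \end{aligned}$$ (1)
$$\begin{aligned}&F_i(x(0),x(1))\le 0,\quad i=1,\ldots ,k,\qquad K(x(0),x(1))=0, \end{aligned}$$ (2)
$$\begin{aligned}&\dot{x}(t)=f(x(t),u(t)),\qquad u(t)\in U\quad \text{for a.a. } t\in [0,1], \end{aligned}$$ (3)
$$\begin{aligned}&U=\{u\in \mathrm{I\! R}^{d(u)}:\; g_i(u)\le 0,\; i=1,\ldots ,r\}, \end{aligned}$$ (4)

with the active index set presumably given by \(I_g(u)=\{i\in \{1,\ldots ,r\}:\; g_i(u)=0\}\).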
2.2 Lyusternik Condition, Weak and Strong Minima in the Main Problem
Define a nonlinear operator \(G:\mathcal{W}\mapsto L^1([0,1],\mathrm{I\! R}^{d(x)})\times \mathrm{I\! R}^s\) (which corresponds to the equality-type constraints of the main problem) as follows:
This operator is continuously Fréchet differentiable, and its derivative at a point \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\) is a linear operator
where \(\hat{\xi }=(\hat{x}(0),\hat{x}(1))\), \(\xi =(x(0),x(1))\), \(w=(x,u)\).
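A sketch of the operator and its derivative, consistent with the description above (the precise form is inferred from the stated domain, range and notation, not quoted):

$$\begin{aligned} G(w)&=\big (\dot{x}-f(x,u),\; K(x(0),x(1))\big ),\\ G'(\hat{w})w&=\big (\dot{x}-f_x(\hat{x},\hat{u})x-f_u(\hat{x},\hat{u})u,\; K'(\hat{\xi })\xi \big ). \end{aligned}$$

In this form, surjectivity of \(G'(\hat{w})\) means that for every \(y\in L^1([0,1],\mathrm{I\! R}^{d(x)})\) and every \(c\in \mathrm{I\! R}^s\), the linearized system \(\dot{x}=f_x(\hat{x},\hat{u})x+f_u(\hat{x},\hat{u})u+y\), \(K'(\hat{\xi })\xi =c\) has a solution \(w=(x,u)\in \mathcal{W}\).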
Definition 2.1
Let a point \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\) satisfy the equality constraints of the main problem, i.e., \(G(\hat{w})=0.\) We say that the Lyusternik condition holds at \(\hat{w}\), if the operator \(G'(\hat{w})\) is surjective.
Any trajectory-control pair \((x,u)\in \mathcal{W}\) satisfying (2)–(3) is called admissible. Recall that a weak local minimum is a local minimum over admissible pairs in the space \(\mathcal{W}\). Further, an admissible \((\hat{x},\hat{u})\) is called a strong local minimizer if there exists an \(\varepsilon >0\) such that \(J(x,u)\ge J(\hat{x},\hat{u})\) for any admissible \((x,u)\in \mathcal{W}\) such that \(\Vert x-\hat{x}\Vert _\infty <\varepsilon \). Obviously, any strong local minimizer is a weak local minimizer.
2.3 Second-Order Necessary Conditions for a Weak Local Minimum in the Main Problem
The Pontryagin (Hamiltonian) function and the terminal Lagrange function are defined, respectively, by
where \(p=(p_1,\ldots ,p_{d(x)})\), \(\alpha =(\alpha _0,\ldots ,\alpha _k)\), and \(\beta =(\beta _1,\ldots ,\beta _s)\) are considered as row vectors. The augmented Pontryagin (Hamiltonian) function has the form:
where \(\mu =(\mu _1,\ldots ,\mu _r)\) is a row vector.
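With the multipliers treated as row vectors, the standard forms of these three functions would be as follows. This is a sketch under the usual conventions; in particular, the sign with which \(\mu g(u)\) enters is an assumption, chosen to match the minimum (rather than maximum) condition used later:

$$\begin{aligned} H(x,u,p)&=pf(x,u),\\ l(\xi ,\alpha ,\beta )&=\sum _{i=0}^k\alpha _iF_i(\xi )+\beta K(\xi ),\\ \bar{H}(x,u,p,\mu )&=H(x,u,p)+\mu g(u). \end{aligned}$$

Note that all three functions are linear in the multipliers \((\alpha ,\beta ,p,\mu )\).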
Let \((\hat{x},\hat{u})\) be an admissible pair. Denote by \(\Lambda \) the set of all tuples \(\lambda =(\alpha ,\beta ,p,\mu )\) such that
If \((\hat{x},\hat{u})\) is a weak local minimum, then the set \(\Lambda \) is nonempty. This is the well-known first-order necessary condition for a weak local minimum. Since the gradients \(g_i'\) of active control constraints are linearly independent, each \(\lambda \in \Lambda \) is uniquely defined by its components \(\alpha ,\beta \). It follows that the equality \( |\alpha |+|\beta |=1\) is the normalization condition, and the set \(\Lambda \) does not contain a zero element. Moreover, it follows that \(\Lambda \) is a finite-dimensional compact set.
For the point \(\hat{w}=(\hat{x},\hat{u})\), define the critical cone \(\mathcal{C}\) as the set of all pairs \(w=(x,u)\in \mathcal{W}\) such that
where \(\mathcal{M}_{i0}=\{t\in [0,1]:\; g_i(\hat{u}(t))=0\}\), \(i=1,\ldots ,r\).
For any \(\lambda \in \Lambda \) and \(w=(x,u)\in \mathcal{W}\), we set
where \( \langle \bar{H}_{ww}w,w\rangle =\langle \bar{H}_{xx}x,x\rangle +2\langle \bar{H}_{xu}u,x\rangle +\langle \bar{H}_{uu}u,u\rangle ,\) and let
Here and in the sequel, \(\sup _\emptyset (\cdot )=-\infty .\) Note that the supremum in (5) is attained. A direct proof of the following result can be found in [15].
Theorem 2.1
If \(\hat{w}=(\hat{x},\hat{u})\) is a weak local minimum, then the set \(\Lambda \) is nonempty and
Here, (6) is a second-order necessary condition for a weak local minimum.
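The quadratic form and conditions (5), (6) can be sketched as follows. This is a reconstruction: the symbol \(\Omega (w,\lambda )\) and the labels are inferred from later references, and a positive normalizing factor in \(\Omega \), if the source uses one, would not affect the sign conditions:

$$\begin{aligned}&\Omega (w,\lambda )=\langle l_{\xi \xi }(\hat{\xi },\alpha ,\beta )\xi ,\xi \rangle +\int _0^1\langle \bar{H}_{ww}(\hat{x},\hat{u},p,\mu )w,w\rangle \,\mathrm{d}t, \end{aligned}$$
$$\begin{aligned}&\Omega _\Lambda (w)=\sup _{\lambda \in \Lambda }\Omega (w,\lambda ), \end{aligned}$$ (5)
$$\begin{aligned}&\Omega _\Lambda (w)\ge 0\quad \forall \, w\in \mathcal{C}. \end{aligned}$$ (6)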
If the Lyusternik condition does not hold at \(\hat{w}\), then, it can be easily proved (see, e.g., [15]) that there exist \(p\in W^{1,\infty }([0,1],\mathrm{I\! R}^{d(x)}) \) and \(\beta \in \mathrm{I\! R}^s\) with \(|\beta |=1\) such that
Set \( \alpha =0, \quad \mu =0,\quad \lambda =(0,\beta ,p,0).\) Then, obviously \(\lambda \in \Lambda \) and \( -\lambda \in \Lambda .\) This implies the following lemma.
Lemma 2.1
If the Lyusternik condition does not hold at \(\hat{w}\), then
This simple lemma is an important complement to Theorem 2.1. We emphasize that if the Lyusternik condition does not hold, then the inequality \(\Omega _\Lambda (w)\ge 0\) holds for all \(w\in \mathcal{W}\), and not only for \(w\in \mathcal{C}\).
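The step from \(\pm \lambda \in \Lambda \) to the lemma can be made explicit. Assuming, as is standard, that the quadratic form \(\Omega (w,\lambda )\) depends linearly on \(\lambda \) (the multipliers enter \(l\) and \(\bar{H}\) linearly), we have \(\Omega (w,-\lambda )=-\Omega (w,\lambda )\), and therefore, for every \(w\in \mathcal{W}\),

$$\begin{aligned} \Omega _\Lambda (w)=\sup _{\lambda '\in \Lambda }\Omega (w,\lambda ')\ge \max \{\Omega (w,\lambda ),\,\Omega (w,-\lambda )\}=|\Omega (w,\lambda )|\ge 0. \end{aligned}$$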
The idea of the proof of Theorem 2.1 in [15] is simple. Thanks to Lemma 2.1, we can assume that the Lyusternik condition holds at the point \(\hat{w}\) of the weak local minimum. Under this assumption, we consider the system of second-order approximations for the cost and constraints and, using the Lyusternik theorem, we prove that this system has an empty intersection. Then, we apply the separation theorem to this system.
Note that the necessary conditions for a weak local minimum in OC do not play such a dominant role as in the calculus of variations, which is largely a theory of the weak local minimum. One of the reasons is that, in OC, we are dealing with a constraint of the form \(u\in U\), which does not always allow us to use control variations that are small in absolute value. Say, if U consists of a finite number of elements, then such variations simply do not exist. Therefore, when studying the weak local minimum in OC, it is necessary to restrict ourselves to some special classes of sets U. For a long time, the only such class was the class of sets represented in form (4). In recent studies, this class has been significantly expanded. For example, results [9,10,11] can also be effectively applied if U is a cross, a star, etc.
The necessary conditions for a weak local minimum in OC should be considered rather as the first step in the analysis of the conditions for a strong minimum. This is precisely the role of Theorem 2.1. It is worth noting that the necessary condition (6) for a weak local minimum, contained in this theorem, cannot be regarded as complete and final, since its natural strengthening does not turn it into a sufficient condition for a weak local minimum; see [15] for details. However, it is the condition (6) (the wording of which is relatively simple) that is fundamental, when moving to a strong minimum. This ‘incomplete condition’ already leads to the desired result.
2.4 Second-Order Necessary Conditions for a Strong Local Minimum in the Main Problem
In Sect. 3, using Theorem 2.1, we will get a necessary second-order condition for a strong local minimum. We will use the same method of proof as in [7, Chapter 4, Section 4.4]. (This approach was proposed by A.A. Milyutin in the 1980s.) To state the theorem that we want to prove, we need some additional notation.
Denote by M the set of all \(\lambda =(\alpha ,\beta ,p,\mu )\in \Lambda \) such that the minimum condition holds at the point \(\hat{u}\):
Set
The condition \(M\ne \emptyset \) is equivalent to the Pontryagin minimum principle, which is a necessary first-order condition for a strong local minimum. Note that M is a finite-dimensional compact set, and the supremum in (8) is attained.
Theorem 2.2
If \((\hat{x},\hat{u})\) is a strong local minimum, then the set M is nonempty and
A much more refined result for the more general OC problem was obtained in [7, Theorem 4.10], but, as earlier remarked, the proofs in [7] are long and complicated. Our aim now is to give a relatively simple proof of Theorem 2.2, based on Theorem 2.1 and using the so-called sliding mode (relaxation) method.
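The objects entering Theorem 2.2 can be sketched as follows. This is a reconstruction: the labels (7)–(9) and the symbol \(\Omega _M\) are inferred from the references in the text, with \(\Omega (w,\lambda )\) denoting the quadratic form of Sect. 2.3:

$$\begin{aligned}&H(\hat{x}(t),u,p(t))\ge H(\hat{x}(t),\hat{u}(t),p(t))\quad \forall \, u\in U,\quad \text{for a.a. } t\in [0,1], \end{aligned}$$ (7)
$$\begin{aligned}&\Omega _M(w)=\sup _{\lambda \in M}\Omega (w,\lambda ), \end{aligned}$$ (8)
$$\begin{aligned}&\Omega _M(w)\ge 0\quad \forall \, w\in \mathcal{C}. \end{aligned}$$ (9)

Since \(M\subset \Lambda \), condition (9) is stronger than (6): the same critical cone is used, but the supremum is taken over a smaller set of multipliers.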
3 Proof of the Main Result
3.1 Refinement of Theorem 2.1
The following concept will be used to prove the main result.
We say that a weak s-necessity [1] holds at an admissible point \(\hat{w}=(\hat{x},\hat{u})\) of the main problem, if there is no sequence of admissible points \(w^n=(x^n,u^n)\), \(n=1,2,\ldots \) such that for all n
and \(\Vert w^n-\hat{w}\Vert \rightarrow 0\) as \(n\rightarrow \infty \).
Clearly, the weak local minimum implies the weak s-necessity.
A direct proof of the following result can be found in [15].
Theorem 3.1
If \(\hat{w}=(\hat{x},\hat{u})\) is a point of weak s-necessity or the Lyusternik condition does not hold at \(\hat{w}\), then the set \(\Lambda \) is nonempty and condition (6) holds.
Theorem 3.1 immediately implies Theorem 2.1. Theorem 3.1 is an important refinement of Theorem 2.1. We will use this refinement in the proof of the main result.
Unfortunately, Theorem 3.1 was not formulated in [15]. Instead, Theorem 2.1 was formulated and proved there as the main result. But from the proof given in [15, Section 3], it easily follows that Theorem 3.1 also holds (see the proof of Lemma 3 in [15]).
3.2 Associated Problem, Sliding Modes
We shall use the following notation
where \(\mathcal {U}\) is the set of admissible controls that is
Let \(a=(u^1,\ldots , u^N)\in {\mathcal {A}}\). Along with the main problem, consider the so-called associated (or relaxed) problem, defined by (1), (2) and the relations
where \(u\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\), \(v^i\in L^\infty ([0,1],\mathrm{I\! R})\), \(i=1,\ldots ,N\). In the new problem, the control is the tuple \((u,v^1,\ldots ,v^N)\), and x is the state variable.
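A sketch of the relations defining the associated problem, read off from the form of the dynamics in (26) and from the constraints discussed below (the split of the inequality constraints between labels (14) and (15) is an assumption):

$$\begin{aligned}&\dot{x}=f(x,u)+\sum _{i=1}^N v^i\Big (f(x,u^i)-f(x,u)\Big ), \end{aligned}$$ (13)
$$\begin{aligned}&v^i\ge 0,\quad i=1,\ldots ,N,\qquad \sum _{i=1}^N v^i\le 1,\qquad g(u)\le 0. \end{aligned}$$ (14)–(15)

For \(v^1=\cdots =v^N=0\), Eq. (13) reduces to the original dynamics \(\dot{x}=f(x,u)\). If at some t exactly one \(v^i(t)\) equals one and the others vanish, the right-hand side equals \(f(x,u^i)\). Intermediate values of \(v^i\) produce convex combinations of the velocities \(f(x,u)\), \(f(x,u^1),\ldots ,f(x,u^N)\), which is the sliding mode.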
Let us check the linear independence of the gradients of active inequality control constraints in the associated problem. As will be seen later, the constraint \(\sum _{i=1}^N v^i\le 1\) will always be inactive at the reference point, and therefore, when studying the weak minimum, it can be ignored. So, we consider only the constraints
The gradients of these constraints at the point \((u,v^1,\ldots ,v^N)\), considered as row vectors, are of the form
respectively. Take any reals \(\eta ^1\), \(\ldots \), \(\eta ^N\) and a row vector \(\mu \in \mathrm{I\! R}^{r}\). Suppose that all components \(\mu _i\) of the vector \(\mu \) that correspond to inactive constraints are zero, i.e., the complementary slackness conditions \(\mu _ig_i(u)=0\) hold for all i. Suppose that the combination of gradients (16) with some coefficients \(\mu \), \(\eta ^1\), \(\ldots \), \(\eta ^N\) is equal to zero. Then, obviously we get: \(\eta ^1=\cdots =\eta ^N=0\), and \(\mu g'(u)=0.\) The latter implies that \(\mu =0,\) since the gradients of active control constraints are linearly independent in the main problem. It means that the gradients of active control constraints are linearly independent in the associated problem as well.
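The block structure behind this argument can be written out. Assuming the constraints are taken in the form \(g_i(u)\le 0\) and \(-v^j\le 0\) (the sign convention for the second group is an assumption and does not affect the conclusion), their gradients with respect to \((u,v^1,\ldots ,v^N)\) are \((g_i'(u),0,\ldots ,0)\) and \((0,\ldots ,0,-1,0,\ldots ,0)\) with \(-1\) in the position of \(v^j\). A vanishing linear combination then reads

$$\begin{aligned} \sum _{i=1}^r\mu _i\big (g_i'(u),0,\ldots ,0\big )+\sum _{j=1}^N\eta ^j\big (0,\ldots ,0,-1,0,\ldots ,0\big )=\big (\mu g'(u),\,-\eta ^1,\ldots ,-\eta ^N\big )=0, \end{aligned}$$

so the \(v\)-components give \(\eta ^j=0\) for all j, and the \(u\)-component gives \(\mu g'(u)=0\).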
Note that the associated problem (1), (2), (13)–(15) has the same type as the main problem (1)–(3). It is considered in the space
with elements \(z=(x,u,v^1,\ldots ,v^N)\) and the norm
The local minimum in this norm is a weak local minimum in the associated problem.
Let \(\hat{w}=(\hat{x},\hat{u})\in \mathcal{W}\) be an admissible point in the main problem. For this point, we define an admissible point \(\hat{z}\) in the associated problem such that \(x=\hat{x}\), \( u=\hat{u}, \quad v^1=\cdots = v^N=0,\) that is \(\hat{z}= (\hat{x},\hat{u},0,\ldots ,0)\). The weak s-necessity at the point \(\hat{z}\) in the associated problem means that there is no sequence \( z_n= ( x_n, u_n, v^1_n,\ldots , v^N_n)\), \( n=1,2,\ldots \), such that for all n
where \( \xi _n=( x_n(0), x_n(1))\), \(\hat{\xi }=(\hat{x}(0),\hat{x}(1))\).
The following important lemma will be proved in this section.
Lemma 3.1
If there exists \(a\in \mathcal A\) such that \(\hat{z}\) is not a point of weak s-necessity in the associated problem and the Lyusternik condition holds at this point in the associated problem, then \(\hat{w}\) is not a strong local minimum in the main problem.
The proof of this lemma will be based on the theorem of Dmitruk; see [16, Theorem 3]. Below, we give this theorem in a simplified version, convenient for application in our case.
Theorem 3.2
Let \(a=(u^1,\ldots , u^N)\in {\mathcal {A}}\). Suppose that the Lyusternik condition holds at a point \(\tilde{z}= (\tilde{x},\tilde{u},\tilde{v}^1,\ldots ,\tilde{v}^N),\) satisfying the equality constraints of the associated problem and such that
Then, there is a sequence of points \( z_n= ( x_n, u_n, v^1_n,\ldots , v^N_n)\), \( n=1,2,\ldots \), satisfying the equality constraints of the associated problem and such that
- (i) \(\Vert x_n-\tilde{x}\Vert _\infty \rightarrow 0\) as \(n\rightarrow \infty \),
- (ii) \(\Vert u_n-\tilde{u}\Vert _\infty \rightarrow 0\) as \(n\rightarrow \infty \),
- (iii) each difference \( v_n^i-\tilde{v}^i\) converges weakly* to zero in \(L^\infty \) (i.e., \(L^1\)-weakly) as \(n\rightarrow \infty \), \(i=1,\ldots ,N\), and
- (iv) each function \( v_n^i\) takes only two values, zero or one, and the same is true for each sum \(\sum _{i=1}^N v^i_n\).
(To get Theorem 3.2 from [16, Theorem 3] for some \(\hat{a}=(\hat{u}^1,\ldots ,\hat{u}^N)\in \mathcal A\) , we need to put \(g(x,u^i,t)=u^i-\hat{u}^i\), \(i=1,\ldots ,N\) in the definition of ‘extended system’ (4)–(7) in [16], and after that apply [16, Theorem 3] to this particular system.)
Proof of Lemma 3.1
Note that we will not use assertion (iii) of Theorem 3.2 in the proof of Lemma 3.1, while assertion (iv) of this theorem will be very important. Since \(\hat{z}\) is not a point of weak s-necessity, by definition there is a sequence of points \( z_n=(x_n,u_n,v^1_n,\ldots ,v^N_n)\) satisfying conditions (17)–(23). Without loss of generality, we assume that the Lyusternik condition holds at each point \( z_n\), since it always holds on an open set, and \(\Vert z_n-\hat{z}\Vert \rightarrow 0\) as \(n\rightarrow \infty \). According to Theorem 3.2, each point \( z_n\) can be “approximated” by a point \(\tilde{z}_n=(\tilde{x}_n,\tilde{u}_n,\tilde{v}_n^1,\ldots ,\tilde{v}_n^N)\) such that the norms \(\Vert \tilde{x}_n- x_n\Vert _\infty \) and \(\Vert \tilde{u}_n- u_n\Vert _\infty \) are so small that
- (a) Conditions
$$\begin{aligned}&F_0(\tilde{\xi }_n)<F_0( \hat{\xi }),\quad F_i(\tilde{\xi }_n)< 0, \quad i=1,\ldots ,k, \end{aligned}$$ (24)
$$\begin{aligned}&\mathrm{ess\,sup}_{[0,1]}g(\tilde{u}_n)<0 \end{aligned}$$ (25)
hold for all \(n=1,2,\ldots \),
- (b) The equality constraints of the associated problem are satisfied:
$$\begin{aligned} K(\tilde{\xi }_n)=0,\quad \dot{\tilde{x}}_n=f(\tilde{x}_n,\tilde{u}_n) +\sum _{i=1}^N \tilde{v}^i_n \Big (f(\tilde{x}_n,u^i)-f(\tilde{x}_n,\tilde{u}_n)\Big ) \end{aligned}$$ (26)
for all \(n=1,2,\ldots \),
- (c) Each function \(\tilde{v}_n^i\) takes only two values, zero or one, \(i=1,\ldots ,N\), and the same is true for each sum \(\sum _{i=1}^N \tilde{v}_n^i\),
- (d) \(\Vert \tilde{x}_n-\hat{x}\Vert _\infty \rightarrow 0\) as \(n\rightarrow \infty .\)
Set
Then, in view of condition (c),
and hence
Moreover, in view of (c), conditions (25) imply
Conditions (24), (26), (27), (28) together with condition (d) mean that the sequence \(\tilde{w}_n'=(\tilde{x}_n,\tilde{u}_n')\), \(n=1,2,\ldots \) violates the strong local minimum at \(\hat{w}\) in the main problem. The lemma is proved. \(\square \)
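The construction used in this proof can be sketched explicitly. The formula for \(\tilde{u}_n'\) below is a reconstruction consistent with conditions (c), (25) and (26), and the assignment of the labels (27) and (28) is an assumption. By (c), at almost every t either all \(\tilde{v}_n^i(t)\) vanish or exactly one of them equals one, so one can set

$$\begin{aligned} \tilde{u}_n'(t)={\left\{ \begin{array}{ll} \tilde{u}_n(t), &{} \text{if } \sum _{i=1}^N\tilde{v}_n^i(t)=0,\\ u^i(t), &{} \text{if } \tilde{v}_n^i(t)=1. \end{array}\right. } \end{aligned}$$

In both cases, the right-hand side of the differential equation in (26) equals \(f(\tilde{x}_n,\tilde{u}_n')\), so that

$$\begin{aligned}&\dot{\tilde{x}}_n=f(\tilde{x}_n,\tilde{u}_n'), \end{aligned}$$ (27)
$$\begin{aligned}&g(\tilde{u}_n'(t))\le 0\quad \text{for a.a. } t\in [0,1], \end{aligned}$$ (28)

where the last relation uses (25) on the set where \(\tilde{u}_n'=\tilde{u}_n\) and the admissibility of each \(u^i\) elsewhere. Note that \(\tilde{u}_n'\) need not be close to \(\hat{u}\) in \(L^\infty \), which is why the sequence violates the strong, not the weak, minimum.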
3.3 Second-Order Necessary Conditions for a Weak Local Minimum in the Associated Problem
Let \(\hat{w}=(\hat{x},\hat{u})\) be a strong local minimum in the main problem and \(a=(u^1,\ldots ,u^N)\in \mathcal A\). In the sequel, it will be convenient to supply the objects related to the associated problem with the superscript a. It follows from Lemma 3.1 that either \(\hat{z}= (\hat{x},\hat{u},0,\ldots ,0)\) is a point of a weak s-necessity in the associated problem, or the Lyusternik condition does not hold at this point in the associated problem. Then, applying Theorem 3.1 to the associated problem, we obtain the following result.
Lemma 3.2
Suppose that \(\hat{w}=(\hat{x},\hat{u})\) is a strong local minimum in the main problem. Then, for any \(a=(u^1,\ldots ,u^N)\in \mathcal A\), the following second-order necessary condition is satisfied at the point \(\hat{z}= (\hat{x},\hat{u},0,\ldots ,0)\) in the associated problem: the set \(\Lambda ^{a}\) is nonempty and
Let us describe the set \(\Lambda ^{a}\), the functional \(\Omega ^{a}( z,\lambda ^{a})\), and the critical cone \(\mathcal{C}^{a}\) at the point \(\hat{z}\) in the associated problem.
3.3.1 Set \(\Lambda ^{a}\)
Define the functions
Since \(\hat{v}^i=0 \), \(i=1,\ldots , N\), the set \(\Lambda ^{a}\) at the point \(\hat{z}\) in the associated problem consists of all tuples \(\lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta )\) such that
Let us analyze these conditions. Complementarity condition (37) and the conditions \(\hat{v}^1=\cdots =\hat{v}^N=0\) imply \(\zeta =0\). Then, taking into account that all \(\eta ^i\ge 0\), from conditions (41), we obtain
Thus, \(\lambda =(\alpha ,\beta ,p,\mu )\) is a tuple from \(\Lambda \), satisfying conditions (42). Moreover, it follows from (41) that
Conversely, if \(\lambda =(\alpha ,\beta ,p,\mu )\) is an arbitrary element of \(\Lambda \) satisfying conditions (42), then setting \(\zeta =0\), and defining \(\eta ^i\) by (43), we get the element \(\lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta ) \in \Lambda ^{a}\).
Denote by \(M^{a}\) the set of all \(\lambda =(\alpha ,\beta ,p,\mu )\in \Lambda \) satisfying conditions (42). We have proved the following lemma.
Lemma 3.3
The set \(\Lambda ^{a}\) consists of all tuples \(\lambda ^{a}:=(\alpha ,\beta ,p,\mu ,\eta ^1,\ldots ,\eta ^N,\zeta )\) such that \(\lambda =(\alpha ,\beta ,p,\mu )\in M^{a}\), the components \(\eta ^i\) are determined by conditions (43), and \(\zeta =0\).
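The origin of the relations for \(\eta ^i\) can be sketched as follows. Assume the standard form \(H(x,u,p)=pf(x,u)\) and multipliers \(\eta ^i\ge 0\), \(\zeta \ge 0\) attached to the constraints \(-v^i\le 0\) and \(\sum _{i=1}^N v^i-1\le 0\), respectively (these sign conventions are assumptions). Stationarity of the augmented Pontryagin function of the associated problem with respect to \(v^i\) at the point \(\hat{z}\) then reads

$$\begin{aligned} p\big (f(\hat{x},u^i)-f(\hat{x},\hat{u})\big )-\eta ^i+\zeta =0,\quad i=1,\ldots ,N. \end{aligned}$$

With \(\zeta =0\) and \(\eta ^i\ge 0\), this gives

$$\begin{aligned} \eta ^i=H(\hat{x},u^i,p)-H(\hat{x},\hat{u},p)\ge 0,\quad i=1,\ldots ,N, \end{aligned}$$

which is the role played by conditions (42) (the inequality) and (43) (the formula for \(\eta ^i\)): the reference control \(\hat{u}\) is compared pointwise with each control \(u^i\) of the collection a.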
3.3.2 Critical Cone \(\mathcal{C}^{a}\)
For \(a\in \mathcal A\) and the corresponding point \(\hat{z}\), let us describe the critical cone \(\mathcal{C}^{a}\) at the point \(\hat{z}\) in the associated problem. First, we write the equation in variations for differential Eq. (13) at \(\hat{z}\). Since \(\hat{v}^1=\cdots =\hat{v}^N=0\), we get
We also must take into account that the constraint \(\sum _{i=1}^N v^i -1 \le 0 \) is not active at the point \(\hat{z}\). Hence, \(\mathcal{C}^{a}\) consists of all tuples \( z=( x, u, v^1,\ldots , v^N)\) such that \( x\in W^{1,\infty }([0,1],\mathrm{I\! R}^{d(x)})\), \(u\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\), \(v^i\in L^\infty ([0,1],\mathrm{I\! R})\), \(v^i\ge 0\), \(i=1,\ldots ,N\), Eq. (44) holds, and
Let \(\mathcal{C}_0^{a}\) be the subset of tuples \( z=( x, u, v^1,\ldots , v^N)\in \mathcal{C}^{a}\) such that \( v^1=0,\ldots , v^N=0.\) The following lemma is obvious.
Lemma 3.4
The projection
maps the cone \(\mathcal{C}^{a}_0\) onto the critical cone \(\mathcal{C}\).
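For illustration, the equation in variations (44) can be written out. Assuming the associated dynamics have the form \(\dot{x}=f(x,u)+\sum _{i=1}^N v^i\big (f(x,u^i)-f(x,u)\big )\), as in (26), linearization at \(\hat{z}=(\hat{x},\hat{u},0,\ldots ,0)\) gives

$$\begin{aligned} \dot{x}=f_x(\hat{x},\hat{u})x+f_u(\hat{x},\hat{u})u+\sum _{i=1}^N v^i\Big (f(\hat{x},u^i)-f(\hat{x},\hat{u})\Big ), \end{aligned}$$

because the terms containing the derivatives of \(f(x,u^i)-f(x,u)\) in x and u are multiplied by \(\hat{v}^i=0\). For \(v^1=\cdots =v^N=0\), this is the linearized equation \(\dot{x}=f_x(\hat{x},\hat{u})x+f_u(\hat{x},\hat{u})u\) of the main problem, which is why the projection maps \(\mathcal{C}^{a}_0\) onto \(\mathcal{C}\).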
3.3.3 Quadratic Form \(\Omega ^{a}\)
It can be easily verified that for any \(\lambda ^{a}\in \Lambda ^{a}\) and any z,
where \( w=( x, u)\), \(\xi =(x(0),x(1))\), and then the following lemma becomes obvious.
Lemma 3.5
If an element z satisfies \( v^1=0,\ldots , v^N=0,\) then
where \(\lambda \) is the projection of \(\lambda ^{a}\) under the mapping
and w is the projection of z under the mapping (45).
Let us return to Lemma 3.2. Since \(\mathcal{C}^{a}_0\subset \mathcal{C}^{a}\), we can replace the critical cone \(\mathcal{C}^{a}\) in condition (29) with the smaller cone \(\mathcal{C}^{a}_0\), and with this change, Lemma 3.2 remains valid. From here, taking into account Lemmas 3.3–3.5, we obtain the following result.
Theorem 3.3
If \(\hat{w}=(\hat{x},\hat{u})\) is a strong local minimum in the main problem, then, for any \(a\in \mathcal A\), the following second-order necessary condition holds in the main problem: the set \(M^{a}\) is nonempty and
Now, we can derive Theorem 2.2 from Theorem 3.3. This will be done in the next section.
3.4 Proof of Theorem 2.2
Assume that \(\hat{w}=(\hat{x},\hat{u})\) is a strong local minimum in the main problem. Then, by Theorem 3.3, for any \(a\in \mathcal A\), the set \(M^{a}\) is nonempty and condition (46) holds true. The set \(\mathcal A\) is directed by the inclusion: a is followed by \(a'\) if the set of controls of \(a'\) contains the set of controls of a; in this case, we write \(a \subset a'\); moreover, each two collections \(a_1\), \(a_2\) are followed by the third \(a_3\), whose set of controls is the union of the sets of controls of \(a_1\) and \(a_2\).
Now, consider the family of sets \(\{M^a\}_{a\in \mathcal {A}}\). This family is directed by the inverse inclusion: if \(a \subset a'\), then \(M^a\supset M^{a'}\). For any two collections \(a_1\) and \(a_2\) and a collection \(a_3\) such that \(a_1 \subset a_3\) and \(a_2 \subset a_3\), we have \(M^{a_1}\cap M^{a_2}\supset M^{a_3}.\) Clearly, each of the sets \(M^{a}\) is closed. Thus, \(\{M^a\}_{a\in \mathcal {A}}\) is a centered family of nonempty closed subsets of the finite-dimensional compact set \(\Lambda \), and hence the intersection
is nonempty. Since \(M^0\subset M^{a}\) for any \(a\in \mathcal A\), it follows from (42) that for any \(\lambda =(\alpha ,\beta ,p,\mu )\in M^0\) and any \(u(\cdot )\in L^\infty ([0,1],\mathrm{I\! R}^{d(u)})\) satisfying the control constraint \(u(t)\in U\) a.e. in [0, 1], we have
By the measurable selection theorem this implies the minimum condition (7) for the given \(\lambda \). Consequently, \(M^0\subset M\). The opposite inclusion is obvious. Hence, \(M^0= M\).
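The first step of this argument can be sketched in formulas. The notation below is inferred from the text, and it is assumed that \(\mathcal A\) consists of all finite collections of admissible controls, so that a single admissible control forms a collection with \(N=1\):

$$\begin{aligned} M^0=\bigcap _{a\in \mathcal A}M^a,\qquad H(\hat{x}(t),u(t),p(t))\ge H(\hat{x}(t),\hat{u}(t),p(t))\quad \text{for a.a. } t\in [0,1]. \end{aligned}$$

Indeed, given \(\lambda \in M^0\) and an admissible control \(u(\cdot )\), take \(a=(u)\). Then \(\lambda \in M^{a}\), and condition (42) for this collection is exactly the displayed inequality. The measurable selection theorem is then needed to pass from this inequality along measurable controls to the pointwise condition over all \(u\in U\).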
Let \(w\in \mathcal{C}\) be an arbitrary element. By virtue of Theorem 3.3, taking into account the compactness of \(M^a\) for all \(a\in \mathcal A\), we get: for any \(a\in \mathcal A\), there is an element \(\lambda (a)\in M^a\) such that
Let \(\lambda \) be a limit point of the net \(\{\lambda (a)\}_{a\in \mathcal A}\). Since \(\lambda (a')\in M^{a'}\subset M^a\) whenever \(a\subset a'\), and each \(M^a\) is closed, we obtain: \(\lambda \in M^a\) for all \(a\in \mathcal A\), and hence \(\lambda \in M\). Passing to the limit in condition (47), we get: \(\Omega (w,\lambda )\ge 0\), and hence condition (9) of Theorem 2.2 holds. Theorem 2.2 is completely proved.
4 Conclusions
In this paper, we consider an optimal control problem in the Mayer form, with endpoint constraints of equality and inequality type and a control constraint, specified by a finite number of inequalities. We assume that all data are twice smooth and that the gradients of active control constraints are linearly independent at any point satisfying these constraints. We prove the necessary second-order condition for a strong local minimum for an arbitrary measurable and essentially bounded optimal control. No qualifying assumption is made regarding the control system and endpoint constraints. The proof method uses the transition from the main problem to the so-called associated problem, generated by a finite number of admissible controls. Using Dmitruk’s theorem, we show that for each finite collection of admissible controls, the strong local minimum in the main problem implies the necessary second-order condition for a weak local minimum in the associated problem. Then, analyzing the last conditions, we obtain the desired result.
The question arises: is it possible to prove a similar result for the case of control constraint given by an arbitrary compact set, using the first- and second-order tangents to this set and without using any qualification assumptions regarding the control system and endpoint constraints?
References
1. Levitin, E.S., Milyutin, A.A., Osmolovskii, N.P.: Conditions of high order for a local minimum in problems with constraints. Russ. Math. Surv. 33, 97–168 (1978)
2. Aronna, M.S., Bonnans, J.F., Dmitruk, A.V., Lotito, P.A.: Quadratic conditions for bang-singular extremals. Numer. Algebra Control Optim. 2, 511–548 (2012)
3. Hoehener, D.: Variational approach to second-order optimality conditions for control problems with pure state constraints. SIAM J. Control Optim. 50, 1139–1173 (2012)
4. Osmolovskii, N.P.: Second-order sufficient optimality conditions for control problems with linearly independent gradients of control constraints. ESAIM Control Optim. Calc. Var. 18, 452–482 (2012)
5. Osmolovskii, N.P.: On second-order necessary conditions for broken extremals. J. Optim. Theory Appl. 164, 379–406 (2015)
6. Osmolovskii, N.P.: Theory of higher-order conditions in optimal control. Habilitation, Moscow Civil Engineering Institute, Moscow (1988). [in Russian]
7. Osmolovskii, N.P.: Necessary quadratic conditions of extremum for discontinuous controls in OC problem with mixed constraints. J. Math. Sci. 183, 435–576 (2012)
8. Osmolovskii, N.P.: Sufficient quadratic conditions of extremum for discontinuous controls in OC problem with mixed constraints. J. Math. Sci. 173, 1–106 (2011)
9. Frankowska, H., Osmolovskii, N.: Second-order necessary optimality conditions for the Mayer problem subject to a general control constraint. In: Bettiol, P., Cannarsa, P., Colombo, G., Motta, M., Rampazzo, F. (eds.) Analysis and Geometry in Control Theory and its Applications. Springer INDAM Series, vol. 12, pp. 171–207. Springer, Cham (2015)
10. Frankowska, H., Osmolovskii, N.: Second-order necessary conditions for a strong local minimum in a control problem with general control constraints. Appl. Math. Optim. 80, 135–164 (2019)
11. Frankowska, H., Osmolovskii, N.P.: Strong local minimizers in optimal control problems with state constraints: second-order necessary conditions. SIAM J. Control Optim. 56(3), 2353–2376 (2018)
12. Aubin, J.-P., Frankowska, H.: Set-Valued Analysis. Birkhäuser, Berlin (1990)
13. Ioffe, A.D.: On generalized Bolza problem and its application to dynamic optimization. J. Optim. Theory Appl. 182, 285–309 (2019). https://doi.org/10.1007/s10957-019-01485-z
14. Ioffe, A.D.: Towards the theory of strong minimum. A view from variational analysis. arXiv:1904.10647v2 [math.OC]. Accessed 25 June 2019
15. Osmolovskii, N.P.: Necessary second-order conditions for a weak local minimum in a problem with endpoint and control constraints. J. Math. Anal. Appl. 457, 1613–1633 (2018)
16. Dmitruk, A.V.: Approximation theorem for a nonlinear control system with sliding modes. Proc. Steklov Math. Inst. 256, 92–104 (2007)
Acknowledgements
I wish to express my gratitude to the anonymous referees for helpful remarks.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Osmolovskii, N.P. Necessary Second-Order Conditions for a Strong Local Minimum in a Problem with Endpoint and Control Constraints. J Optim Theory Appl 185, 1–16 (2020). https://doi.org/10.1007/s10957-020-01647-4