Partial derivative with respect to the measure and its application to general controlled mean-field systems☆
Introduction
Let $T>0$ be a fixed time horizon and $(\Omega,\mathcal F,P)$ a given complete probability space on which a $d$-dimensional Brownian motion $B$ is defined. In this paper we are interested in studying the partial derivative of a function $f(\mu)$ with respect to the law $\mu$ conditioned to its second marginal $\mu_2$, where $\mu$ runs over the space of probability measures over a product space whose first marginal has a finite second-order moment. In the second part, as an interesting application, we study Peng's maximum principle for a class of general optimal control problems with McKean–Vlasov dynamics of type (1.1), where $P_{(X_t,u_t)}$ denotes the joint law of the state and the control under $P$, the filtration is the natural one generated by $B$, and the coefficients are measurable functions of appropriate dimensions (the precise assumptions on them will be given in Section 4). The control process $u$ takes its values in an arbitrary measurable space $U$. Here, we will not assume any differentiability of the coefficients with respect to the control $u$, nor with respect to the second marginal of the law of $(X_t,u_t)$.
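For orientation, the dynamics (1.1) can be sketched in the following standard controlled McKean–Vlasov form; the coefficient names $b$, $\sigma$ and the exact argument list (which in this paper also involves a conditional expectation of the state, reflecting the partial information discussed below) are assumptions pending the precise formulation in Section 4:

```latex
\begin{cases}
dX_t = b\big(t, X_t, P_{(X_t,u_t)}, u_t\big)\,dt
     + \sigma\big(t, X_t, P_{(X_t,u_t)}, u_t\big)\,dB_t, & t\in[0,T],\\[2pt]
X_0 = x_0 \in \mathbb{R}^d,
\end{cases}
```

where $P_{(X_t,u_t)}$ is the joint law of the state and the control under $P$.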
Stochastic control problems with controlled McKean–Vlasov dynamics have been used to describe the (Nash) equilibrium state of symmetric games; see, for example, Chassagneux et al. [9]. In [8], with the help of a tailor-made stochastic maximum principle, the authors proved the existence of a mean-field game strategy. It is worth noting that their stochastic maximum principle relies on the assumption that the Hamiltonian is strictly convex with respect to the control, which plays an important role in their discussion.
The main difficulty in dealing with the stochastic maximum principle (SMP) without convexity of the control state space or regularity assumptions on the coefficients with respect to the control variable is to find the variational equations and the adjoint equations. Peng [17] was the first to get rid of the convexity assumption on the control state space by using the second-order term in the Taylor expansion, and he proved the necessary condition of optimality for a control in the case where the diffusion coefficient depends on the control. On the other hand, since the pioneering works by J.M. Lasry and P.L. Lions [13] and by Huang, Malhamé and Caines [11], research on mean-field problems has attracted considerable attention, and mean-field optimal stochastic control problems have been studied by many authors; we refer to [1], [2], [3], [14] and the references cited therein. Acciaio et al. [1] studied mean-field stochastic control problems where the cost functional and the state dynamics depend on the joint distribution of the controlled state and the control process. They proved the Pontryagin stochastic maximum principle under differentiability assumptions on the coefficients with respect to the control and its law. Buckdahn, Li and Ma [3] were the first to study the optimal control problem for a class of general mean-field stochastic differential equations with non-convex control domains, and they obtained the related Peng stochastic maximum principle. We emphasize that in [3] the coefficients do not depend on the law of the control; that is, the cost functional and the state dynamics depend only on the law of the controlled state.
Strongly inspired by Buckdahn, Li and Ma [3], [4], our work investigates a generalized mean-field stochastic maximum principle for an optimal control problem in which the coefficients not only depend on the joint law of the state and the control but also involve partial information. As in the works [2], [3], [17], the second-order variational equation and the second-order adjoint equation are obtained without any convexity assumption on the control state space. In our setting the coefficients depend on the joint law of the state and the control without assuming differentiability with respect to the law of the control, which extends the existing works. As in the pioneering work of Peng [17], we do not require differentiability of the coefficients in the control variable, and, in order to be coherent, we do not suppose it with respect to the second marginal of the law either. In the existing literature the partial derivative of a function $f$ over $\mathcal P_2(\mathbb R^d)$ (the space of Borel probability measures with finite second-order moment over $\mathbb R^d$) has been introduced as the first component of the derivative of the function over the product space. However, when, for instance, $\mu$ is a probability measure over $\mathbb R^d\times V$, where $V$ is an arbitrary measurable space, or simply when $V$ is a control state space, such a global differentiability property is rather restrictive and shall be avoided. For this reason we study in Section 3 the partial differentiability with respect to the law conditioned to its second marginal, without any assumption of regularity with respect to that marginal. Our results cover the particular cases where $f$ is globally differentiable, and an example of the partial differentiability of $f$ is discussed in which $f$ does not have any regularity with respect to the second marginal law.
Returning to our generalized mean-field stochastic maximum principle: in addition to the dependence of the coefficients of Eq. (1.1) on the joint law of the state and the control, they also depend on partial information about the controlled state process, i.e., in our case, on a conditional expectation of the state, which immediately leads to subtle difficulties in establishing some appropriate estimates. A special case of our setting was studied in [2], where the dependence of the coefficients on the conditional expectation reduces to that on the expectation. It is very natural to study optimal control problems under partial information, since controllers can only access partial information in most cases; see, for example, Huang, Wang and Xiong [12], who obtained the stochastic maximum principle for control problems under partial information, assuming differentiability of the coefficients with respect to the control.
The optimal control problem studied in this paper can be illustrated by the following motivating example:
Example 1.1 Let there be given a sequence of independent 1-dimensional Brownian motions defined over a probability space. We consider the model of a financial market with a highly risky equity fund and a riskless asset with a given risk-free rate. The price of a share of the equity fund evolves with given volatility and return rate. Each of the $N$ investors holds a portfolio (the portfolio of the $i$th investor) which he tries to optimize. The value at time $t$ of the $i$th investor's portfolio has a sell price perturbed by an independent Brownian motion with its own volatility. The portfolios are supposed to be self-financing, which determines the dynamics of the portfolio values. The $i$th investor optimizes his portfolio by choosing an optimal process which is adapted with respect to the filtration generated by the Brownian motions he observes. His gain functional depends on his own choice but also on the investment processes of the other investors. It involves a utility function satisfying suitable assumptions, indicating the utility of the sell value for the $i$th investor, which he also measures in comparison with the average sell value observed at time $t$; the empirical cumulative distribution function of the sell values and its left-inverse at level $\lambda$ enter the functional, the latter being interpreted as the empirical value at risk at level $\lambda$. This empirical value at risk describes the risk exposure of the $i$th investor by comparing his investment process with those of the other investors,
and can be regarded as the expected empirical shortfall at level $\lambda$ (expected empirical CVaR). Thus, the gain functional is the difference between the expected utility of the sell value at a finite time horizon and an insurance premium for the risk run by the investor with his investment strategy. Each of the $N$ investors optimizes his gain functional, and because of the symmetry of the control problem with respect to the investors, one can assume that there is a measurable, non-anticipating functional giving the optimal investment strategy of the $i$th investor. Moreover, when the number of investors tends to infinity, one obtains, with the help of an adequate version of the law of large numbers, the weak convergence, almost surely, of the regular conditional law of the sell value knowing the commonly observed noise, for a limit investment strategy given by some non-anticipating, measurable functional, and for the conditional cumulative distribution function of the sell value knowing that noise. Then, observing that this conditional distribution function is continuous and bounded, supposing, for simplicity, that the sell value is bounded, and using the conditional independence of the individual noises knowing the common one, we see that, as $N$ tends to infinity, the empirical expected shortfall converges to the expected shortfall at level $\lambda$ with respect to the regular conditional probability knowing the common noise. Hence, taking into account the convergence of the optimal controls, this leads to the limit control problem for a typical investor, when the number of investors is very large, endowed with the corresponding limit gain functional. We observe here that the expected shortfall is a non-differentiable function of the conditional law of the sell value. On the other hand, we see the particular role played here by the conditional law. Remarking that this conditional law is taken with respect to the $\sigma$-field generated by the common noise, we see that the above example generalizes in a direct way to control problems of the type (1.1) endowed with a gain or cost functional of the form (4.2).
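The empirical value at risk and expected shortfall appearing in the gain functional can be computed directly from samples. The following Python sketch is ours for illustration (the function names and the CVaR convention of averaging the empirical quantile function over $(0,\lambda]$ are assumptions, not the paper's definitions):

```python
import math

def empirical_var(samples, lam):
    """Empirical value at risk at level lam: the left-inverse of the
    empirical CDF, i.e. inf{x : F_n(x) >= lam}."""
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(lam * n)          # smallest k with k/n >= lam
    return xs[max(k, 1) - 1]

def empirical_cvar(samples, lam):
    """Expected empirical shortfall at level lam: the average of the
    empirical quantile function over (0, lam] (one common CVaR convention)."""
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(lam * n)
    # The empirical quantile function is piecewise constant:
    # it equals xs[i-1] on the interval ((i-1)/n, i/n].
    total = sum(xs[i] for i in range(k - 1)) / n    # complete intervals
    total += xs[k - 1] * (lam - (k - 1) / n)        # partial last interval
    return total / lam
```

For instance, for the samples [1, 2, 3, 4] at level 0.5, the empirical value at risk is 2 and the expected shortfall is 1.5, the average of the worst half of the outcomes.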
Our main result, the SMP for the control problem with dynamics (1.1), can be stated roughly as follows. We assume that $\bar u$ is an optimal control and $\bar X$ is the corresponding optimal controlled process. Then there exist two pairs of stochastic processes, the solutions of the first-order and the second-order adjoint equations, respectively, such that the necessary maximum condition holds for all $t\in[0,T]$, $dt\,dP$-almost everywhere. Moreover, on the classical Wiener space with the coordinate process as Brownian motion, if the drift, diffusion and cost coefficients are continuous with respect to the measure argument in the 2-Wasserstein metric, for all $t$, with a continuous modulus, then the pointwise maximum condition holds for all control values and all $t\in[0,T]$, $dt\,dP$-almost everywhere, where the Hamiltonian takes the standard form involving the adjoint processes.
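In the classical (non-mean-field) second-order SMP the Hamiltonian mentioned above has the following shape; the sign convention for the running cost $f$ and the precise argument list (which in this paper also contains the joint law and the partial information) are assumptions for orientation:

```latex
H(t,x,\mu,u,p,q) \;=\; \langle p,\, b(t,x,\mu,u)\rangle
  \;+\; \mathrm{tr}\!\big(q^{\top}\sigma(t,x,\mu,u)\big)
  \;+\; f(t,x,\mu,u),
```

where $(p,q)$ is the solution of the first-order adjoint equation.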
The key point for the proof of the stochastic maximum principle is to show the second-order expansion of the controlled state process: for all $t\in[0,T]$, the state process corresponding to the spike variation $u^\varepsilon$ of the optimal control $\bar u$ (where $u^\varepsilon$ equals a fixed control value on a Borel subset $E_\varepsilon\subset[0,T]$ of Borel measure $\varepsilon$ and coincides with $\bar u$ outside $E_\varepsilon$) decomposes into the optimal state process, the solutions of the first- and the second-order variational equations, respectively, and a remainder which tends to $0$ quicker than $\varepsilon$.
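In Peng's framework this expansion (1.3) typically takes the following shape (a sketch; the exact spaces and orders of the estimates are those established in Section 5):

```latex
X^{u^{\varepsilon}}_t \;=\; \bar X_t + Y_t + Z_t + R^{\varepsilon}_t,
\qquad t\in[0,T],
```

where $Y$ and $Z$ solve the first- and second-order variational equations and the remainder $R^{\varepsilon}$ tends to $0$ quicker than $\varepsilon$ in the appropriate $L^p$-sense.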
Since the coefficients depend not only on the state process but also on the joint law of the state and the control, there are technical difficulties in proving the estimate (1.3). It is worth emphasizing that the rather technical results of Proposition 5.2 and Lemma 5.2, with their subtle proofs, play a crucial role in the study of the estimates in the Taylor expansion. In particular, in the proof of Proposition 5.2, which shows the specificity of stochastic controlled systems with mean-field dependence, we develop an operator argument which is entirely new and differs from the classical case; see [3].
The paper is organized as follows. In Section 2 the notion of differentiability with respect to a probability measure is recalled. In Section 3 the partial derivative with respect to the law conditioned to its second marginal is studied, and the obtained results are illustrated by two examples. Section 4 is devoted to the formulation of the control problem and the statement of the SMP. The (first- and second-order) variational equations and some crucial estimates are established in Section 5. In Section 6 we prove our main result, the stochastic maximum principle.
Preliminaries
Let $(\Omega,\mathcal F,P)$ be a complete probability space and $\mathbb F=(\mathcal F_t)_{t\in[0,T]}$ a filtration satisfying the usual assumptions. For any sub-$\sigma$-field $\mathcal G\subset\mathcal F$, we denote:
$L^2(\mathcal G;\mathbb R^d)$ is the set of $\mathbb R^d$-valued, $\mathcal G$-measurable random variables $\xi$ with $E[|\xi|^2]<\infty$, which is a Hilbert space with inner product $\langle\xi,\eta\rangle=E[\xi\cdot\eta]$, $\xi,\eta\in L^2(\mathcal G;\mathbb R^d)$.
$L^2_{\mathbb F}(0,T;\mathbb R^d)$ is the set of $\mathbb R^d$-valued, $\mathbb F$-adapted processes $\varphi=(\varphi_t)_{t\in[0,T]}$ on $[0,T]$ such that $E\big[\int_0^T|\varphi_t|^2\,dt\big]<\infty$.
$\mathcal P_2(\mathbb R^d)$ is the collection of all probability measures with finite second moment over $(\mathbb R^d,\mathcal B(\mathbb R^d))$.
Partial derivative with respect to the law conditioned to its second marginal
Let $(V,\mathcal V)$ be an arbitrary measurable space. By $\mathcal P(\mathbb R^d\times V)$ we denote the set of all probability measures over $(\mathbb R^d\times V,\mathcal B(\mathbb R^d)\otimes\mathcal V)$, and we restrict ourselves to those whose first marginal has a finite second moment. For such a measure $\mu$ we denote by $\mu_1$ and $\mu_2$ the first and the second marginal of $\mu$, respectively. This space can be endowed with a metric of 2-Wasserstein type, defined through the collection of couplings of the measures involved.
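The metric alluded to here is, on $\mathcal P_2(\mathbb R^d)$, the classical 2-Wasserstein distance; the following standard probabilistic form is a sketch, and the exact adaptation to measures over $\mathbb R^d\times V$ with a fixed second marginal is as specified in the paper:

```latex
W_2(\mu,\nu) \;=\; \inf\Big\{\big(E\big[\,|\xi-\eta|^2\,\big]\big)^{1/2}
  \;:\; \mathcal L(\xi)=\mu,\ \mathcal L(\eta)=\nu\Big\},
\qquad \mu,\nu\in\mathcal P_2(\mathbb R^d).
```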
Problem formulation
Let $U$ be an arbitrary measurable space and $(\Omega,\mathcal F,P)$ a complete probability space such that $\Omega$, equipped with its Borel $\sigma$-field $\mathcal B(\Omega)$, is a Radon space, $\mathcal F$ is the completion of $\mathcal B(\Omega)$ with respect to $P$, and suppose that on $(\Omega,\mathcal F,P)$ a $d$-dimensional Brownian motion $B$ is defined. Given an arbitrary but fixed finite time horizon $T>0$, we denote the completed natural filtrations generated by the Brownian motion and by the underlying data, respectively. Let … denote the completed
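A controlled mean-field dynamics of this kind can be approximated numerically by an interacting particle system, replacing the law by the empirical measure of $N$ particles. The following minimal Python sketch uses illustrative mean-reverting dynamics of our own choosing (the drift, volatility, and all names are assumptions, not the paper's coefficients):

```python
import math
import random

def simulate_mkv_particles(n_particles, n_steps, horizon, x0, seed=0):
    """Euler-Maruyama scheme for an interacting particle approximation
    of a McKean-Vlasov SDE.  Illustrative dynamics (our assumption):
        dX_t = (E[X_t] - X_t) dt + 0.2 dB_t,
    with E[X_t] replaced by the empirical mean of the particle system."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    xs = [float(x0)] * n_particles
    for _ in range(n_steps):
        mean = sum(xs) / n_particles       # empirical approximation of E[X_t]
        xs = [x + (mean - x) * dt + 0.2 * math.sqrt(dt) * rng.gauss(0.0, 1.0)
              for x in xs]
    return xs
```

As the number of particles grows, the empirical mean of the system concentrates around the mean of the limiting McKean–Vlasov dynamics, which here stays close to the common starting point.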
Variational equations
Now we introduce the first-order and the second-order variational equations. Since the control set $U$ is not necessarily convex, we shall use the spike variation method. More precisely, let $\varepsilon>0$, and choose a Borel subset $E_\varepsilon\subset[0,T]$ with Borel measure $|E_\varepsilon|=\varepsilon$. For an arbitrarily chosen but fixed $v\in U$, we define
$$u^\varepsilon_t := v\,\mathbf 1_{E_\varepsilon}(t) + \bar u_t\,\mathbf 1_{[0,T]\setminus E_\varepsilon}(t),\qquad t\in[0,T],$$
which is called a spike variation of the optimal control $\bar u$.
The key point to prove the stochastic maximum principle stated in Theorem 4.1 is to show for the controlled
Proof of Theorem 4.1
From the definition of the cost functional, the optimality of the control, and the definitions in (5.31) and in Corollary 5.1, we obtain
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (17)
- R. Buckdahn, J. Li, S. Peng, Mean-field backward stochastic differential equations and related partial differential equations, Stochastic Process. Appl. (2009)
- J. Li, Stochastic maximum principle in the mean-field controls, Automatica (2012)
- J. Li, Mean-field forward and backward SDEs with jumps and associated nonlocal quasi-linear integral-PDEs, Stochastic Process. Appl. (2018)
- B. Acciaio, J. Backhoff-Veraguas, R. Carmona, Extended mean field control problems: stochastic maximum principle and transport perspective, SIAM J. Control Optim. (2018)
- R. Buckdahn, B. Djehiche, J. Li, A general stochastic maximum principle for SDEs of mean-field type, Appl. Math. Optim. (2011)
- R. Buckdahn, J. Li, J. Ma, A stochastic maximum principle for general mean-field systems, Appl. Math. Optim. (2016)
- R. Buckdahn, J. Li, J. Ma, A mean-field stochastic control problem with partial observations, Ann. Appl. Probab. (2017)
- R. Buckdahn, J. Li, S. Peng, C. Rainer, Mean-field stochastic differential equations and associated PDEs, Ann. Probab. (2017)
☆ The work has been supported by the NSF of P.R. China (No. 12031009, 11871037), the National Key R&D Program of China (No. 2018YFA0703900), NSFC-RS (No. 11661130148; NA150344), the “FMJH Program Gaspard Monge in optimization and operation research”, and the ANR (Agence Nationale de la Recherche), France, project ANR-16-CE40-0015-01.