
Linear Quadratic Forward-Backward Stochastic Mean Field Games

Posted on: 2017-05-18 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: S J Wang | Full Text: PDF
GTID: 1109330485479605 | Subject: Financial mathematics and financial engineering
Abstract/Summary:
Game theory is the study of strategic decision making. Generally speaking, it is the study of mathematical models of cooperation and conflict among intelligent, rational decision-makers. It is mainly used in economics, political science, psychology, logic, computer science, biology, etc. The subject first addressed zero-sum games, in which one participant's gains exactly equal the net losses of the other participant or participants. Today, however, game theory applies to a wide range of behavioral relations and has developed into an umbrella term for the logical side of decision science. In many social, economic and engineering models, the individuals or agents involved have conflicting objectives. Therefore it is more appropriate to consider the optimization problem based upon individual payoffs or costs. This gives rise to noncooperative game-theoretic approaches, partly based upon the vast corpus of relevant work within economics, the social sciences, etc.

In the literature, studies of stochastic dynamic games and team problems may be traced to the 1960s (see, e.g., [1,2,3,4]). In the optimal control context, weakly interconnected systems were studied in [5], and Nash equilibria were analyzed in a two-player noncooperative nonlinear dynamic game setting in [6]. In recent years, controlled stochastic large-population (also called multi-agent) systems have become important because they appear widely in areas such as political science, economics, engineering, etc. Accordingly, the dynamic optimization or control of this kind of system has attracted consistent and intense attention from research communities. The most distinctive feature of a controlled large-population system is the existence of a considerable number of insignificant agents who are individually negligible but whose collective behavior has a significant impact on all agents.
This feature can be captured by a weak-coupling structure in the individual dynamics and/or cost functionals via the state-average across the whole population. In this way, the individual behaviors of all agents at the micro-scale can be connected to their mass effects at the macro-scale. This kind of weak coupling in both dynamics and costs is used to model the mutual impact of agents during competitive decision-making. In particular, the dynamic coupling specifies the impact of the environment on an individual's decision-making, and the underlying model takes the form of weakly coupled diffusions subject to individual controls.

It is remarkable that the classical strategies, which consolidate all agents' exact states, turn out to be infeasible and ineffective due to the highly complicated coupling structure in large-population systems. Alternatively, it is more tractable and effective to study strategies in which each agent considers only its own individual state and some off-line quantities. For large-population stochastic dynamic games with mean field (MF) couplings, Nash certainty equivalence theory was originally developed in a series of papers by Huang together with Caines and Malhamé. The optimization of large-scale linear control systems, wherein many agents are coupled with each other via their individual dynamics and the costs are in an "individual to the mass" form, was presented in [7]. Then a general formulation of nonlinear McKean-Vlasov Markov process models was developed in [8,9,10].

This thesis mainly focuses on the study of large-population systems in the linear quadratic (LQ) case, where the state equations are linear in the state with nonhomogeneous terms and the cost functionals are quadratic. Recall that linear systems and the related LQ control have already been extensively investigated. Such a control problem is called a linear quadratic optimal control problem. Readers can refer to [11] for some classic results on deterministic LQ problems.
For the stochastic case, the problems were addressed in [12,13]. A systematic introduction to stochastic LQ optimal control problems can be found in the monograph [14] and the references therein. Other related literature includes [15,16,17], etc. Due to the nice structure of the LQ problem, there is also a rich literature on large-population problems modeled by LQ systems. LQ games in large-population systems where the agents evolve according to nonuniform dynamics were considered, and an ε-Nash equilibrium property was proved, in [18]. In [19], the author solved the Hamilton-Jacobi-Bellman and Kolmogorov-Fokker-Planck equations and found explicit Nash equilibria in the form of linear feedbacks. [20] studied a class of LQ control problems with N decision makers, where the basic objective is to minimize a social cost given by the sum of N individual costs containing MF coupling. Later on, [21] provided a comprehensive study of a general class of MF games in the LQ framework. For more literature on LQ problems with large populations, see [22,23,24], etc.

As a new branch of game theory, mean field games (MFGs) arise from a variety of areas, such as particle physics, economics, biology, etc. In many situations in particle physics, it is possible to construct an excellent approximation by introducing one or more "mean fields" that serve as mediators for describing inter-particle interactions. In this kind of model, one describes the contribution of each particle to the creation of a mean field and the effect of the mean field on each particle by conceiving each particle as infinitesimal, i.e., by carrying out a kind of limit process on the number N of particles (N→+∞). In game theory, from a mathematical standpoint, this involves studying the limit of a large class of N-player games as N tends to infinity. Usually, differential games with N players turn out to be intractable.
Fortunately, things simplify as the number of players increases, at least for a wide range of games that are symmetric as far as the players are concerned. Indeed, complex inter-individual strategies can no longer be implemented by the players, for each player is progressively lost in the crowd in the eyes of the other players as the number of players increases.

During the last few decades, there has been a growing literature on the study of MFGs and their applications. For this class of game problems, a closely related approach was independently developed in [25,26,27]. Based on these, considerable research attention has been drawn along this line. See [28,29,30,31] for recent progress in MFG theory. Introductions to MFGs and some models were given in [28]. In [29], the authors provided a complete probabilistic analysis of a large class of stochastic differential games with MF interactions. [30] was devoted to discussing and comparing two methods of investigating the asymptotic regime of stochastic differential games with a finite number of players as the number of players tends to infinity. In addition, in [31], a model of inter-bank borrowing and lending was proposed, and systemic risk was analyzed.

MF type control has also been extensively studied recently. In [32], the authors obtained MF backward stochastic differential equations (BSDEs) associated with an MF stochastic differential equation (SDE) as the limit of a high-dimensional system of forward and backward SDEs, corresponding to a large number of agents. Later on, [33] deepened the investigation of such MF BSDEs with general coefficients and presented the related partial differential equations. Based upon these, [34] and [35] independently studied the optimal control of an SDE of MF type when the action space is convex, which was a partial result of [36]. Moreover, [37] provided an existence result for the solution of fully coupled forward-backward SDEs (FBSDEs) of MF type. For more literature about MF games and controls, see [38,39,40,41,42,43,44,45,46,47,48,49,50], etc.
It is worth pointing out that there are differences between MFGs and MF type control. Generally speaking, as addressed in [29,30,31], MFGs and MF type control are essentially different in the methods applied and the equilibria derived. To be precise, the MFG method is "asynchronous" in style. It first fixes or freezes the state-average x^(N)_t (in the linear case) or the empirical measure μ^x_t (in the nonlinear case), reducing the initial problem to a standard problem parameterized by the frozen term. Note that the frozen term is still undetermined at this step. Next, this parameterized standard problem can be solved and the optimal state can be obtained. This can be called the decentralized control. With this in hand, the frozen state-average or empirical measure can be determined through a fixed-point analysis and a consistency condition on the resulting optimality system. In this sense, the state-average (or empirical measure) and the underlying control in an MFG change "asynchronously". By contrast, in MF type control problems, the state-average or empirical measure is not fixed or frozen beforehand. Instead, it changes depending on the underlying control applied. In this way, the state-average term and the state itself are treated in a "synchronous" style.

It is remarkable that all agents in the above literature are comparably negligible, in that they are not able to affect the whole population individually. Instead, their impacts are imposed in a unified manner through the population state-average. In this sense, all agents can be viewed as peers. One real example is market price formation, in which a considerable number of producers or firms produce the same type of product. Each firm is so small that its individual production behavior cannot affect its peers' states. However, the average production of all firms will determine the market price of this product.
All small firms are price-takers, and thus they further interact and are coupled via this price formation mechanism. The above discussion assumes that all agents participate equally in the market price formation. In reality, however, the status and roles of agents may differ significantly in various situations. For instance, the decision making of small individuals is always influenced by some "leading" agent or "dominant" institution. In our price formation example, such a "leading" agent can be interpreted as a monopoly firm that holds considerable production capacity and thus has a more significant effect on price formation. The "dominant" institution can be viewed as the local government, as its industrial policy greatly affects the production behavior of all firms. Conversely, the small firms also affect the local government through the market price. One channel is the production tax revenue: it is an important factor in calibrating the local government's state, and it depends on the resulting market price.

The above discussion suggests the so-called major-minor agent models. To be more precise, consider the following oil production example. In crude oil exploration, each individual oil production company aims to explore for more oil and thus pursue more profit. In this sense, their production plans tend to take little account of macro-factors such as limited oil resources, possible environmental costs, and long-term benefits. On the other hand, these factors are mainly the concern of the relevant supervisory department or local government. Unlike the individual oil companies, they are more concerned with factors such as sustainable development and the overall benefit of the oil sector. Thus, they will always execute some macro-control policy, assuming the responsibility of the major agent.
All small firms (as the minor agents) should obey these policies when making their production plans. Consequently, the set of all individual small producers constitutes the minor-agent part, and it is further coupled with the local government (the major agent) via their state-average. Major-minor large-population systems and the related MFGs have been extensively studied. Looking back at previous work, [51] discussed large-population systems with major and minor players by analyzing the case in an infinite set where the minor players are from a total of K classes. Later on, [52] considered an LQ problem with major and minor players by directly treating the mean field z in the population limit as a random process with random coefficients. Recently, [53] studied large-population dynamic games involving nonlinear stochastic dynamical systems with a major agent and a population of N minor agents and derived the ε_N-Nash equilibrium property, where ε_N = O(1/(?)). In addition, [54] studied a game problem in a weak formulation; this means in particular that the game was of the type "feedback control against feedback control". The payoff/cost functional was then defined through a controlled BSDE, for which the driving coefficient was assumed to satisfy strict concavity-convexity with respect to the control parameters.

In most control problems, the information is assumed to be completely observed. However, this may not be reasonable in reality. Since the agents have different roles, positions, methods, etc., their observations are different. It turns out that various stochastic control problems fit into the partial information framework due to factors such as finite data, latent processes or noisy observations. An extensive review of stochastic control with partial information was provided in [55]. There is also a rich literature on partially observed stochastic control systems (see, e.g., [56,57,58,59,60,61,62,63] for previous work and [64,65,66,67,68,69,70,71,72,73] for recent work).
For the literature on partially observed stochastic games, please refer to [74,75,76] and the references therein. Note that a class of LQ mean field games (LQMFGs) with noisy observations was also addressed in [77], but it was defined on an infinite time horizon, so algebraic Riccati equations were involved. Moreover, the limiting state-average in [77] was deterministic as there was no common noise, which is rather different from the relevant problem in this thesis.

It is important to note that in all the above works involving large-population systems, all agents' states are formulated by (forward) SDEs with the initial conditions given a priori. As a consequence, the agents' objectives are minimizations of cost functionals involving their terminal states. As BSDEs are well-defined stochastic systems with a broad range of applications, it is very natural to study their dynamic optimization in the large-population setup. Indeed, the dynamic optimization of backward large-population systems is inspired by a variety of scenarios. Examples include dynamic economic models in which the participants have recursive utilities or nonlinear expectations, and production planning problems with tracking terminal objectives that are affected by the market price via the production average. Another example arises from risk management when considering relative or comparable criteria based on the average performance of all other peers across the whole sector. This is the case for a given pension fund that evaluates its own performance by setting the average performance (say, average hedging cost or initial deposit, surplus) as its benchmark. In addition, controlled forward large-population systems that are subject to terminal constraints can be reformulated as backward large-population systems, as motivated by [78]. In contrast to an SDE, for a BSDE the terminal rather than the initial condition should be specified a priori.
As a consequence, the BSDE admits an adapted solution pair (y_t, z_t), where the second component z_t arises naturally from the martingale representation and the adaptedness requirement. There exists a rich literature on the theory and applications of BSDEs. Linear BSDEs were first introduced by [79]. In 1990, Pardoux-Peng [80] first introduced nonlinear BSDEs and established the existence and uniqueness theorem under the standard Lipschitz condition. Based on this pioneering work, the theory of BSDEs has developed rapidly in many fields, such as mathematical finance, partial differential equations, stochastic control and differential games, functional analysis, etc. Independently, [81] presented a stochastic differential recursive utility, which is a generalization of the standard additive utility in which the instantaneous utility depends not only on the instantaneous consumption rate but also on the future utility. As found by [82], the utility process can be regarded as the solution of a special BSDE. [82] also gave formulations of recursive utilities and their properties from the point of view of BSDEs. A BSDE coupled with an SDE in its terminal condition formulates the FBSDE. Forward-backward large-population dynamic optimization problems arise naturally in many practical situations. A typical situation is the large-population system with a constrained terminal condition (see, e.g., [83]). In this situation, the standard forward stochastic control problem can be well approximated by a forward-backward stochastic control problem.

In the last few decades, FBSDEs have been well studied. There are several well-known results on wellposedness. The method of contraction mapping was first used by [84] and later detailed by [85]. It works well when the duration T is relatively small.
Another method, called the "four step scheme" ([86]), was the first solution method that removed the restriction on the time duration for Markovian FBSDEs. The third is the method of continuation. This method can treat non-Markovian FBSDEs with arbitrary duration; it was initiated by [87] and [88], and later developed by [89] and [90]. Please refer to the book [91] for detailed accounts of all three methods. Recently, in [92], the authors found a unified scheme which combines all existing methodologies and overcomes some fundamental difficulties that have been longstanding problems for non-Markovian FBSDEs. For more theoretical and practical results on FBSDEs, please refer to [93,94,95,96,97,98,99,100,101,102,103,104,105,106] and the references therein. According to the interdependence of the states, FBSDEs can be divided into two kinds: partially coupled FBSDEs and fully coupled FBSDEs. The former means that the backward state y_t (or the forward state x_t) depends explicitly on the forward state x_t (or the backward state y_t), but x_t (respectively y_t) does not depend explicitly on y_t (respectively x_t); this form is commonly used to represent recursive utility or nonlinear expectation (see, e.g., [83,68,41]). In fact, the forward state x_t usually represents the dynamics of some underlying asset, while the backward state y_t stands for the nonlinear expectation or recursive utility of the decision maker. Thus it is reasonable and natural that the recursive utility depends on the underlying state while, conversely, the forward underlying state is not affected by the recursive utility adopted. In addition, mathematically, there is great value in formulating and studying fully coupled FBSDEs (namely, those in which the forward and backward states are interdependent).

Different from the game problems driven by FBSDEs above, many stochastic optimal control problems affected by other mechanisms have extensive applications in practice, such as impulse control, time delay, regime-switching systems, etc.
The stochastic maximum principle, which is used to investigate optimal controls, plays an extremely important role in theoretical research and practical applications. The maximum principle, a necessary condition for optimal control, was first formulated and studied by Pontryagin et al. [107] in the 1950s and 1960s. Bismut [79] introduced linear BSDEs as the adjoint equations, which was a milestone in the development of this theory. After Pardoux-Peng [80] established the theory of nonlinear BSDEs, the general stochastic maximum principle was obtained by Peng in [108] by introducing second-order adjoint equations. Afterwards, Peng [109] first studied the stochastic maximum principle for optimal control problems of forward-backward control systems when the control domain is convex. Since BSDEs and FBSDEs are involved in a broad range of applications in mathematical finance, economics and so on, it is natural to study control problems involving FBSDEs. A rich literature on the stochastic maximum principle has been obtained; see [94,110,111,68,112] and the references therein. Recently, Wu [?] established the general maximum principle for optimal controls of forward-backward stochastic systems in which the control domains were non-convex and the forward diffusion coefficients depended explicitly on the control variables. This greatly promoted the development of the maximum principle for general forward-backward stochastic systems.

With the great development of stochastic control theory, stochastic impulse control problems have received considerable research attention due to their wide applications in portfolio optimization problems with transaction costs (see [113,114]) and optimal strategies for exchange rates between different currencies ([115,116]). Korn [117] also investigated some applications of impulse control in mathematical finance. For a comprehensive survey of the theory of impulse controls, one is referred to [118].
Wu and Zhang [103] first studied stochastic optimal control problems of forward-backward systems involving impulse controls, in which they assumed the domain of the regular controls was convex and obtained both the maximum principle and sufficient optimality conditions. Later on, in [119] they considered the forward-backward system in which the domain of regular controls was not necessarily convex and the control variable did not enter the diffusion coefficient.

The applications of regime-switching models in finance and stochastic control have also been researched in recent years. Compared to the traditional system based on diffusion processes, the regime-switching model is more meaningful from the empirical point of view. Specifically, it modulates the system with a continuous-time finite-state Markov chain, with each state representing a regime of the system or a level of an economic indicator. Based on the switching diffusion model, much work has been done in the fields of option pricing, portfolio management, risk management and so on. In [120], Crepey focused on the pricing equations in finance. Crepey and Matoussi [121] investigated reflected BSDEs with Markov chains. For the controlled problem with a regime-switching model, Donnelly studied the sufficient maximum principle in [122]. Using the results about BSDEs with Markov chains in [120,121], Tao and Wu [123] derived the maximum principle for the forward-backward regime-switching model. Moreover, in [124] the weak convergence of BSDEs with regime switching was studied.

In addition, delayed stochastic systems have been found to have many real backgrounds, such as economics, finance, management, engineering and decision sciences (see Arriojas et al. [125], Mohammed [126,127]). The main reason is that many phenomena in these fields are characterized by past-dependence, i.e., their behavior at time t depends not only on the situation at t but also on their past history.
Such models may be identified as stochastic differential delay equations (SDDEs). Nevertheless, delayed systems are difficult to handle because of the delayed responses, not only because the problem is infinite-dimensional, but also because of the absence of an Itô formula to deal with the delayed part of the trajectory. In order to surmount these difficulties, one can consider specific classes of systems with aftereffect; see, e.g., Øksendal and Sulem [128].

Motivated by the above research, this thesis mainly focuses on two classes of maximum principles for forward-backward stochastic optimal control problems with impulse control: one is the forward-backward regime-switching system with impulse, and the other is the forward-backward delay system with impulse. In the first class, the control system is described by a forward-backward stochastic differential equation in which all the coefficients contain Markov chains. This case is more complicated than those in [123] and [103,119]. In the second class, the control system is described by a forward-backward stochastic differential delay equation (FBSDDE), and the control variables consist of regular and impulsive parts, both with time delay. It is well known that there is a perfect duality between SDEs and BSDEs. Peng and Yang [129] introduced a new type of BSDEs called anticipated BSDEs (ABSDEs) and established a duality between SDDEs and ABSDEs. By virtue of the theory of ABSDEs and the dual method, Chen and Wu [130] first obtained a necessary maximum principle for controlled systems involving delays in both the state variable and the control variable. Then Yu [131] obtained the stochastic maximum principle for delayed controlled systems involving impulse controls, in which the dynamics were modeled by an SDDE and the domain of the regular controls was convex.
For more literature on delay problems, please refer to [132,133,134] and the references therein.

The thesis is organized as follows:

The large-population dynamic optimization in the forward-backward setting is formulated and the related LQMFGs for partially coupled FBSDEs are investigated in Chapter one. For the individual agent, the optimal control of an auxiliary tracking system is studied. The decoupling procedure of the Hamiltonian system (due to its two-dimensional feature) involves four Riccati equations and two ordinary differential equations (ODEs). Moreover, the decentralized control strategies are derived from the consistency condition and an approximation scheme. In addition, the ε-Nash equilibrium property of the original problem is verified based on some FBSDE estimates.

We further study the backward LQ games of large-population systems in Chapter two, in which the individual states follow BSDEs. The presence of BSDEs makes the setting very different from the existing works on LQMFGs, wherein the individual states evolve according to SDEs. The individual state dynamics are weakly coupled through the state-average, and a full information structure is assumed. The explicit form of the limiting process and the ε-Nash equilibrium of the original problem are investigated.

The large-population system in the major-minor framework is considered in Chapter three, in which the major agent's dynamics are characterized by a BSDE with prescribed terminal condition while the minor agents' dynamics are governed by SDEs with prescribed initial conditions. In this way, the major agent's objective is to minimize a cost functional depending on the initial state, while the minor agents want to minimize cost functionals depending on their terminal states. In addition, the problem in which the major player takes into account its relative performance in comparison with the minor players is considered.
The related LQMFGs are discussed and the decentralized strategies are derived. A stochastic process related to the state of the major player is introduced as the approximation of the state-average process. An auxiliary MF SDE and a 3×2 FBSDE system are considered and analyzed. Here, the 3×2 FBSDE is composed of three forward and three backward equations. With the help of the monotonic method in [88] and [104], the wellposedness of this FBSDE is obtained. Finally, the ε-Nash equilibrium property of the original problem is derived with ε = O(1/(?)).

Chapter four is devoted to the dynamic optimization of large-population systems with a partial information structure. Here, the individual agents can only access the filtration generated by one observable component of the underlying Brownian motion. The state-average limit in this setup turns out to be a stochastic process driven by the common Brownian motion. Two classes of MFGs are proposed in this framework: one is governed by forward dynamics, and the other involves backward dynamics. In the forward case, the associated MFG and a system of Riccati equations are formulated. In the backward case, the explicit forms of the decentralized strategies and a BSDE (satisfied by the limiting process) are obtained. In both cases, the ε-Nash equilibrium properties are presented.

Chapter five focuses on two classes of maximum principles for forward-backward stochastic optimal control problems with impulse control. The first class is the forward-backward regime-switching system with impulse. We obtain the stochastic maximum principle by using spike variation on the regular control and convex perturbation on the impulsive one. Applying the maximum principle to a financial investment-consumption model, we also obtain the optimal consumption processes and analyze the effects of various economic factors on consumption. The second class is the forward-backward delay system with impulse.
The distinguishing technical feature is to analyze the dual relation between SDDEs and ABSDEs and then obtain some delicate estimates for anticipated forward-backward stochastic differential delay equations and the corresponding variational equations. With some additional convexity assumptions, we also prove sufficient optimality conditions, which have potential applications in financial portfolio-choice problems.
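
The freeze-then-match procedure that the abstract attributes to MFGs (freeze the state-average, solve the resulting parameterized problem, then pin down the frozen term through a consistency condition) can be illustrated with a deliberately simplified sketch. The example below is a static quadratic toy game, not the FBSDE model of the thesis: the cost `(u - theta)**2 + q*(u - gamma*m)**2`, the parameters `theta`, `q`, `gamma`, and the function names `best_response` and `solve_consistency` are all assumptions introduced here for illustration only.

```python
# Toy illustration (assumed static model, NOT the thesis's FBSDE setting).
# Each agent i has a private target theta_i and picks u to minimize
#     (u - theta_i)**2 + q * (u - gamma * m)**2,
# where m is the population average of all actions.

def best_response(theta, frozen_mean, q, gamma):
    """Step 1-2: with the average frozen, solve the agent's quadratic problem.

    Setting the derivative 2(u - theta) + 2q(u - gamma*m) to zero gives
    u = (theta + q*gamma*m) / (1 + q).
    """
    return (theta + q * gamma * frozen_mean) / (1.0 + q)


def solve_consistency(thetas, q, gamma, tol=1e-12, max_iter=10_000):
    """Step 3: find m such that the average best response equals m.

    Plain fixed-point iteration; it contracts when |q*gamma/(1+q)| < 1.
    """
    m = 0.0  # initial guess for the frozen average
    for _ in range(max_iter):
        responses = [best_response(t, m, q, gamma) for t in thetas]
        new_m = sum(responses) / len(responses)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical data: four agents with different targets.
thetas = [1.0, 2.0, 3.0, 6.0]
q, gamma = 2.0, 0.5

m_star = solve_consistency(thetas, q, gamma)

# In this toy model the consistency condition can also be solved by hand:
# m = (mean(theta) + q*gamma*m) / (1 + q)  =>  m = mean(theta) / (1 + q*(1 - gamma)).
closed_form = (sum(thetas) / len(thetas)) / (1.0 + q * (1.0 - gamma))

print(m_star, closed_form)
```

In this sketch each agent's rule depends only on its own `theta` and the off-line quantity `m_star`, which mirrors the decentralized structure described above. By contrast, an MF type control problem would let the average move with the control during the optimization rather than freezing it first.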
Keywords/Search Tags:linear quadratic, mean field games, large-population system, forward-backward stochastic differential equations, major-minor problem, partial information, stochastic maximum principle, impulse control