
Research on Fuzzy and Model-Free Optimal Control Based on Single-Network Adaptive Dynamic Programming

Posted on: 2015-09-26    Degree: Doctor    Type: Dissertation
Country: China    Candidate: J L Zhang    Full Text: PDF
GTID: 1108330482954544    Subject: Control theory and control engineering
Abstract/Summary:
As is well known, many schemes exist for designing stabilizing controllers for controlled systems. Stability, however, is only a minimum requirement in system design. Optimal control guarantees stability of the controlled system while also optimizing a performance index, but designing optimal controllers is a difficult and challenging problem. Adaptive dynamic programming (ADP) is a powerful tool for solving optimization and optimal control problems based on the Bellman principle, and it is particularly well suited to designing optimal controllers for nonlinear systems. In this dissertation, we first use a new function approximator, the fuzzy hyperbolic model (FHM) and its generalization, the generalized fuzzy hyperbolic model (GFHM), instead of neural networks, to capture the mapping between the state and the value function, and we design optimal controllers for nonlinear systems on this basis. The fuzzy approximator has physical significance, so its structure can be chosen more rationally using knowledge from human expertise and experiments. The new technique is then applied to the optimal consensus problem of multi-agent systems. In addition, since ADP results for completely model-free nonlinear systems and for time-delay systems are scarce, we develop ADP-based results for such nonlinear systems and for linear time-delay systems, respectively.

The main contributions of the dissertation are as follows:

1. To overcome the drawback that the neural-network structure used in conventional ADP has no physical significance, a fuzzy adaptive dynamic programming (FADP) method is presented to design the optimal control for nonlinear systems. The FHM, instead of a neural network, is used to capture the mapping between the state and the value function. To minimize the residual of the Hamilton-Jacobi-Bellman (HJB) equation introduced by the FHM approximation, gradient descent is used to obtain the near-optimal solution.

2. The FHM is not a universal approximator, and it is effective mainly for value functions that vary near the origin. The GFHM, in contrast, can approximate any smooth function on a compact set, so its approximation of the value function is effective over the whole operating region. Therefore, the GFHM is used instead of the FHM to approximate the solution of the HJB equation, and the optimal control for nonlinear systems is designed accordingly. A stability analysis of the FADP method is given. The FADP method can be regarded as a generalization of conventional neural-network-based ADP.

3. The optimal consensus problem of multi-agent systems is solved by the proposed fuzzy adaptive dynamic programming, which brings together game theory, the GFHM, and ADP techniques. The coupled Hamilton-Jacobi (HJ) equations are constructed via the Bellman principle, and game theory builds the bridge between the Nash equilibrium and the solutions of these HJ equations. GFHMs are then used to obtain a set of approximate coupled HJ equations, which are solved by fuzzy adaptive dynamic programming together with policy iteration. Finally, stability is established by a Lyapunov-based proof, and the weight estimation errors and the consensus errors are shown to be uniformly ultimately bounded (UUB).
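For reference, items 1 and 2 target the standard continuous-time HJB equation for an affine-in-control nonlinear system; the generic formulation below uses common ADP notation and is not quoted from the dissertation. For $\dot{x} = f(x) + g(x)u$ with cost $V(x(t)) = \int_{t}^{\infty} \big( Q(x) + u^{\top} R u \big)\, d\tau$,

$$
0 = \min_{u}\Big[\, Q(x) + u^{\top} R u + \big(\nabla V^{*}(x)\big)^{\top}\big(f(x) + g(x)u\big) \Big],
\qquad
u^{*}(x) = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla V^{*}(x).
$$

Substituting a parameterized approximator $\hat V(x) = \hat{W}^{\top}\sigma(x)$ (an FHM/GFHM in items 1-3, a neural network in item 4) leaves a residual in this equation, and driving that residual toward zero, by gradient descent or least squares, is the common thread of the methods summarized here.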
4. For systems whose governing equations are unknown, an ADP algorithm is presented that designs the optimal control directly from measurements, without building or assuming a model of the system. To circumvent the requirement of a priori knowledge about the system, a precompensator is introduced to construct an augmented system. The corresponding Hamilton-Jacobi-Bellman (HJB) equation is then obtained from the Bellman principle. To minimize the HJB residual introduced by the neural-network approximation, a least-squares technique is used to update the neural-network weights. The main idea of the method is to sample the state, the state derivative, and the input, and to update the neural-network weights in the least-squares sense; the update process is implemented within the policy iteration (PI) framework.

5. A nearly data-based optimal control method is presented for linear discrete-time systems with delays. The nearly optimal control is obtained using only measured input/output data from the system, by a reinforcement learning technique that combines Q-learning with value iteration. First, a state estimator is constructed from the measured input/output data. Then, the estimator is used to design the nearly optimal control for the delayed linear discrete-time system by Q-learning and value iteration (a minimal, delay-free sketch of this data-driven Q-learning value iteration is given after the concluding remarks below).

Finally, concluding remarks are given, some unsolved problems and development directions for approximate dynamic programming are discussed, and prospects for further study are outlined.
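The following is a minimal sketch of the data-driven Q-learning value iteration underlying item 5, reduced to a delay-free, full-state discrete-time linear-quadratic problem for brevity; the system matrices, variable names, and tuning constants are illustrative assumptions, and the delayed output-feedback case in the dissertation additionally requires the state estimator described above.

```python
import numpy as np

# Sketch: data-driven Q-learning with value iteration for discrete-time LQR.
# The learner never uses A or B directly; they only generate the sample data.
np.random.seed(0)
A = np.array([[0.95, 0.10],
              [0.00, 0.80]])          # illustrative "unknown" dynamics
B = np.array([[0.0], [0.1]])          # illustrative "unknown" input matrix
Qc, Rc = np.eye(2), np.eye(1)         # stage-cost weights
n, m = 2, 1

def quad_basis(z):
    """Upper-triangular entries of z z^T, the features of a quadratic Q-function."""
    return np.outer(z, z)[np.triu_indices(len(z))]

# Collect exploratory transitions (x, u, x_next) with a random (exciting) input.
data, x = [], np.random.randn(n)
for _ in range(200):
    u = np.random.randn(m)
    x_next = A @ x + B @ u
    data.append((x, u, x_next))
    x = x_next

H = np.zeros((n + m, n + m))          # Q(x, u) = [x; u]^T H [x; u]
for _ in range(60):                   # value-iteration sweeps
    Hxx, Hxu = H[:n, :n], H[:n, n:]
    Hux, Huu = H[n:, :n], H[n:, n:]
    P = Hxx - Hxu @ np.linalg.pinv(Huu) @ Hux      # greedy value: V(x) = x^T P x
    Phi = np.array([quad_basis(np.concatenate([xk, uk])) for xk, uk, _ in data])
    y = np.array([xk @ Qc @ xk + uk @ Rc @ uk + xk1 @ P @ xk1
                  for xk, uk, xk1 in data])
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # least-squares refit of Q
    H_upper = np.zeros_like(H)
    H_upper[np.triu_indices(n + m)] = theta
    H = (H_upper + H_upper.T) / 2                  # symmetrize (off-diagonals halved)

K = np.linalg.pinv(H[n:, n:]) @ H[n:, :n]          # greedy policy: u = -K x
print("Learned state-feedback gain K:\n", K)
```

On this linear benchmark, the gain K recovered purely from data can be checked against the solution of the discrete-time algebraic Riccati equation computed with the true A and B.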
Keywords/Search Tags: Optimal control, fuzzy adaptive dynamic programming, reinforcement learning, nonlinear systems, multi-agent systems, (generalized) fuzzy hyperbolic model, game theory