In recent years, the design of control systems for hypersonic vehicles has become a research hotspot in aerospace science, with important research value in both military and civil fields. However, traditional aircraft control methods depend strongly on the internal dynamics of the aircraft system. Because the aircraft is strongly nonlinear and an accurate mathematical model is difficult to establish, this paper focuses on intelligent control methods based on reinforcement learning. These methods avoid a strong dependence on the model and enable the aircraft to adapt to actual mission requirements in real time and to update its flight strategy online, so as to meet the demands of complex environments, flexible combat, and intelligent control. The main contents of this paper are as follows.

First, we propose a data-driven algorithm: a weak-model control design method that relies only partially on the internal dynamics model of the aircraft system. The policy iteration (PI) algorithm is then given, and an actor-critic neural network is used to implement it. The two networks are updated synchronously online and eventually yield the optimal weights of both networks, from which the optimal value can be calculated.

Second, because the weak-model algorithm still requires part of the dynamics information of the aircraft system, a new actor-critic network structure is proposed to realize policy iteration. The resulting algorithm is completely data-driven, and we call it the model-free algorithm. It obtains the optimal control without knowledge of the exact mathematical model. The input to the actor network is the system state; the input to the critic network is the system state together with the control signal. Simulation results verify that the proposed algorithm achieves stable control in the presence of model uncertainty and noise.

Finally, trajectory simulation is performed on a high-speed aircraft trajectory model provided by NASA. The velocity system and the altitude system are first obtained from the aircraft system. We then analyze the relationship between the tracking problem and the stabilization problem. If the dynamic information of the system cannot be obtained, the Hamilton-Jacobi-Bellman (HJB) equation is difficult to solve; policy iteration, however, still works without this information. We use the aircraft state information to update the neural networks. In addition, a new technique is applied to the tuning of the critic network to accelerate convergence. The tracking control is completed in MATLAB, and the other states of the aircraft converge to reasonable values.
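For readers unfamiliar with policy iteration, the alternation between policy evaluation and policy improvement can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: it uses a scalar linear system dx/dt = a*x + b*u with quadratic cost q*x^2 + r*u^2, it is model-based (it uses a and b directly), and it replaces the actor-critic networks with closed-form updates. It is not the weak-model or model-free algorithm of this paper, and the function names `policy_iteration_scalar_lqr` and `riccati_solution` are hypothetical.

```python
import math


def policy_iteration_scalar_lqr(a, b, q, r, k0, iterations=20):
    """Policy iteration for dx/dt = a*x + b*u, cost = integral of q*x^2 + r*u^2.

    Assumed setup (not from the paper): the policy is u = -k*x and the
    value function is V(x) = p*x^2. The initial gain k0 must be
    stabilizing, i.e. a - b*k0 < 0.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must stabilize the system")
    k = k0
    history = []
    for _ in range(iterations):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: minimizing the Hamiltonian gives k = b*p/r.
        k = b * p / r
        history.append((p, k))
    return p, k, history


def riccati_solution(a, b, q, r):
    """Positive root of the scalar Riccati equation 2*a*p - (b^2/r)*p^2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k, _ = policy_iteration_scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
    print("policy iteration p:", p, "gain k:", k)
    print("Riccati solution p:", riccati_solution(1.0, 1.0, 1.0, 1.0))
```

In this linear-quadratic special case the HJB equation reduces to the Riccati equation, so the iterates can be checked against the closed-form root. In the actor-critic setting described in the abstract, the evaluation step would instead be approximated by the critic network and the improvement step by the actor network.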