| In recent years, the quadrotor UAV has attracted extensive attention and has become increasingly prevalent in military reconnaissance, logistics, transportation, and other fields owing to its low cost, compact size, and high flexibility. However, the quadrotor UAV is a highly coupled, under-actuated nonlinear system, and its attitude control has long been one of the research hotspots and difficulties in this field. Although classical PID control, sliding mode control, backstepping control, and other control algorithms can achieve good attitude control, they do not achieve optimal attitude control. Therefore, it is of great significance to study optimal control algorithms based on reinforcement learning. The main work of this study is as follows. First, the study analyzes the operating principles of the quadrotor UAV, selects two reference coordinate systems, and determines the rotation matrix for converting between them. The mathematical model of the quadrotor UAV is then established based on Newton’s second law and the Euler equations, and the key technologies associated with the control system are analyzed. Second, the study establishes a new mathematical model of the quadrotor UAV that takes into account the uncertain disturbances encountered in actual flight. Based on this model, it constructs an augmented matrix from the attitude tracking error and the desired attitude. Considering both the tracking error and the energy consumption of the system, it then designs a discounted performance index function that incorporates the upper bound of the uncertain disturbances as well as the overall performance. This approach transforms the tracking problem into an optimal regulation problem for the nominal system, thereby enhancing the overall robustness of the system. Because the optimal control strategy is not readily solvable by conventional methods, the study designs a robust optimal attitude 
controller for quadrotor UAVs based on the reward-feedback mechanism of reinforcement learning. Following the reinforcement learning principle, the study designs a single-critic neural network structure with an iterative weight-updating rule and obtains an estimate of the optimal control strategy. The weight-updating law is devised to optimize the performance of the neural network. To establish the feasibility of the algorithm, the study employs Lyapunov theory to prove that the system is uniformly ultimately bounded. The effectiveness of the algorithm is further corroborated through numerical simulation experiments. Third, the study addresses the transient and steady-state performance of the quadrotor UAV’s attitude and designs a robust optimal attitude controller based on reinforcement learning and prescribed performance. For attitude tracking control, the study designs a prescribed performance function that bounds the attitude tracking error in both the transient and steady-state phases, so that the transient and steady-state performance of the quadrotor is kept within the specified range. However, the resulting inequality-constrained tracking problem is not easy to solve. The study therefore constructs a dynamic model of the transformed error and converts the inequality-constrained problem into an equivalent unconstrained tracking problem. Considering the overall performance of the system, the study introduces a cost function that realizes optimal attitude control, and the approximate optimal control input is obtained using reinforcement learning. To establish the effectiveness of the proposed algorithm, the study employs Lyapunov theory to prove that the system is bounded. The study also conducts numerical simulations, which corroborate the effectiveness of the algorithm. Lastly, the study builds a hardware experimental platform, a virtual simulation 
platform, and a hardware-in-the-loop simulation experimental platform for the quadrotor UAV. The robust optimal attitude control program for the quadrotor UAV, which is based on the reinforcement learning algorithm, is implemented in C++. The program is then compiled into firmware and uploaded to the main controller of the quadrotor UAV. The stability and reliability of the controller are verified through a series of experiments, including virtual simulations, hardware-in-the-loop simulations, and real flight tests. |
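
The prescribed-performance step summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual performance function or error transformation, so the exponentially decaying envelope and the logarithmic mapping below are a commonly used form chosen as an assumption, and the names `performance_bound`, `transformed_error`, `rho0`, `rho_inf`, `decay`, `lower`, and `upper` are all hypothetical.

```python
import math


def performance_bound(t, rho0=1.0, rho_inf=0.05, decay=2.0):
    """Assumed envelope rho(t) = (rho0 - rho_inf) * exp(-decay * t) + rho_inf.

    rho0 bounds the initial error, rho_inf the steady-state error, and
    decay sets the minimum convergence rate (all illustrative values).
    """
    return (rho0 - rho_inf) * math.exp(-decay * t) + rho_inf


def transformed_error(e, rho, lower=1.0, upper=1.0):
    """Map a constrained error -lower*rho < e < upper*rho to an unconstrained one.

    Uses the assumed mapping eps = 0.5 * ln((z + lower) / (upper - z)) with
    z = e / rho, which grows without bound as e approaches either edge of
    the envelope. Keeping eps bounded therefore keeps e inside the envelope.
    """
    z = e / rho
    if not (-lower < z < upper):
        raise ValueError("tracking error is outside the prescribed envelope")
    return 0.5 * math.log((z + lower) / (upper - z))


if __name__ == "__main__":
    # Evaluate the envelope and the transformed error for a decaying error signal.
    for t in (0.0, 0.5, 1.0, 2.0):
        rho = performance_bound(t)
        e = 0.4 * math.exp(-3.0 * t)  # illustrative attitude tracking error (rad)
        print(f"t={t:.1f}  rho={rho:.4f}  e={e:.4f}  eps={transformed_error(e, rho):.4f}")
```

Under these assumptions, a controller that keeps the transformed error bounded also keeps the original tracking error strictly inside the time-varying envelope, which is how the inequality constraint is removed from the tracking problem.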