
Research On Deep Reinforcement Learning Design Method For Multi-Constraint Guidance Control

Posted on: 2024-09-28  Degree: Master  Type: Thesis
Country: China  Candidate: D H Dou  Full Text: PDF
GTID: 2532307142451454  Subject: Electronic information
Abstract/Summary:
This thesis is based on research into small-scale guided munitions and employs deep reinforcement learning (DRL) to investigate aerodynamic identification, controller design, and multi-constraint guidance.

An adaptive genetic algorithm-back propagation neural network (AGA-BPNN) aerodynamic identification model is designed by combining a genetic algorithm with a neural network; it acquires aerodynamic parameters online while augmenting the aerodynamic data. The model takes the angle of attack, Mach number, and rudder deflection angle as inputs and the corresponding aerodynamic coefficients as outputs. The genetic algorithm adaptively adjusts its crossover and mutation probabilities to tune the initial weights and thresholds of the neural network, which effectively prevents the identification results from falling into local optima. After training, the derivative characteristics of the neural network are exploited to further identify the aerodynamic derivatives of the corresponding coefficients, providing the basis for subsequent control-system design and trajectory calculation.

To address the problem that the traditional linear autopilot design process is cumbersome and struggles to meet performance requirements over the full flight envelope, a two-loop autopilot based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is proposed. A deep reinforcement learning model of the two-loop autopilot is constructed, with flight state information as the state and the autopilot control parameters to be designed as the action. A reward function is designed to constrain the stability margin of the system. The TD3 algorithm learns the autopilot control parameters offline over the full flight envelope, yielding a fitted model that can be applied directly in the guidance loop. The fitted model is verified online on a pitch-constrained guidance problem, and simulation results show that the proposed two-loop autopilot
can adjust the control parameters in real time according to the flight state, ensuring attitude stability while accurately tracking acceleration commands.

To address the multi-constraint guidance problem, this study uses the Proximal Policy Optimization (PPO) algorithm to design terminal-angle-constrained DRL guidance control strategies for both a guidance-control feedback loop and an integrated guidance-control system. A Markov decision process (MDP) is constructed that accounts for the full dynamics of the missile body and the influence of the control actuators. The real-time angle error is added to the state vector, and the normal acceleration and rudder deflection angle are bounded. A reward function is designed that reduces the missile-target distance while correcting the pitch-angle error and mitigating the sparse-reward problem. A Beta distribution is used for policy sampling to eliminate the negative effect of unbounded distributions on a bounded action space, and an entropy regularization term is introduced to encourage diverse action exploration. The pitch-angle-constrained DRL strategy is further augmented with a threshold switching control that enforces the seeker field-of-view constraint by correcting the seeker look angle. Simulations and Monte Carlo hit tests in multiple scenarios demonstrate the effectiveness, generality, and robustness of the proposed method.
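As a concrete illustration of the Beta-distribution policy sampling described above, the following is a minimal sketch, not the thesis code: a Beta policy head for a bounded rudder-deflection action, with the log-density and differential entropy used for entropy regularization. The rudder limit, the softplus-plus-one parameter mapping, and the function names are illustrative assumptions.

```python
import math
import random

RUDDER_MAX = 20.0  # assumed rudder-deflection limit in degrees (illustrative)

def softplus(x):
    # Numerically stable softplus, log(1 + exp(x)).
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def beta_log_prob(a, b, u):
    # Log-density of Beta(a, b) at u in (0, 1).
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1.0) * math.log(u) + (b - 1.0) * math.log(1.0 - u) - log_B

def beta_entropy(a, b):
    # Differential entropy of Beta(a, b): the bonus term added to the PPO
    # objective to encourage exploration. Digamma is approximated by a
    # finite difference on lgamma (illustrative only).
    def digamma(z, h=1e-6):
        return (math.lgamma(z + h) - math.lgamma(z - h)) / (2.0 * h)
    log_B = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (log_B - (a - 1.0) * digamma(a) - (b - 1.0) * digamma(b)
            + (a + b - 2.0) * digamma(a + b))

def sample_rudder(alpha_logit, beta_logit, rng=random):
    # Map unconstrained network outputs to Beta parameters (> 1, so the
    # density is unimodal and no mass piles up at the action bounds),
    # sample u in (0, 1), then scale to the symmetric physical range.
    a = 1.0 + softplus(alpha_logit)
    b = 1.0 + softplus(beta_logit)
    u = rng.betavariate(a, b)
    action = (2.0 * u - 1.0) * RUDDER_MAX
    return action, beta_log_prob(a, b, u)
```

Because the Beta support is exactly (0, 1), every sampled action maps inside the physical rudder limits, unlike a Gaussian policy whose samples must be clipped.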
Keywords/Search Tags: Deep reinforcement learning, Autopilot, Multi-constraint guidance, Aerodynamic identification, Policy gradient