Research On Differential Games Of Air Combat Based On Reinforcement Learning

Posted on: 2020-06-06

Degree: Master

Type: Thesis

Country: China

Candidate: Y Zhao

Full Text: PDF

GTID: 2370330605978943

Subject: Engineering

Abstract/Summary:

Differential games, which model conflict and antagonism, play an important role in the military field. Reinforcement learning has attracted wide attention in the study of complex nonlinear systems and multi-agent systems because of its good learning performance. In this thesis, two reinforcement learning algorithms, Minimax Q-learning and fuzzy Q-learning, are used to solve a typical differential game problem: the aircraft pursuit-evasion game.

First, the difficulties of solving differential games and the problems faced by reinforcement learning are introduced, and the theory and main algorithms of reinforcement learning are explained. The thesis then presents the theory of differential games, establishes a differential game model of the pursuit-evasion problem, describes the system by the relative motion state of the aircraft, simplifies the state equations of the pursuit-evasion model, and analyzes the symmetric relationship between the system states and the control quantities of both sides.

Next, Minimax Q-learning is used to solve for the control policies of the pursuer and the evader. The pursuit-evasion game is transformed into a zero-sum game, and the reinforcement learning model is established on the simplified state equations. Under the condition that the pursuer knows the evader's current action, the symmetric relationship between the system states and the control quantities of both sides is used to improve the efficiency of learning the Q-values, and the offline Q-matrix obtained by the Minimax Q-learning algorithm guides both sides in choosing their policies. Simulation results verify the feasibility of the method.

Finally, a non-zero-sum fuzzy Q-learning model is established for each agent, and the optimal control quantities of both sides are calculated. Fuzzy Q-learning can generate globally continuous actions for agents in a continuous-time system, which overcomes the discontinuity of the control quantities in Minimax Q-learning. In practice, the other party's actions cannot be observed at the current moment. The fuzzy Q-learning models of both sides are therefore established and solved under this condition, and the control policies of both parties are calculated from offline Q-matrices. The simulation results show the effectiveness of the method, and the comparison with Minimax Q-learning shows the practicality of fuzzy Q-learning in continuous-time systems.

Keywords/Search Tags:

Differential Games, Pursuit-evasion Game, Reinforcement Learning, Minimax Q-Learning, Fuzzy Q-Learning

Related items

1	Some Problems Of Pursuit Evasion Differential Game With Integral Constraints
2	Research On The Design Of Agent-based Decision Model For Games Based On Reinforcement Learning
3	System Dynamics And Learning Theory In Games
4	Research On Complex Games Based On Deep Reinforcement Learning
5	Reinforcement Learning-based Black-box Evasion Attacks To Link Prediction In Dynamic Graphs
6	Research And Realization Of Game Strategy Based On Deep Reinforcement Learning
7	Research And Application Of Reward Strategies For Reinforcement Learning In Incomplete Information Games
8	Research And Realization Of Complete Information Game Theory Based On Reinforcement Learning
9	Research On Multi-agent Reinforcement Learning Algorithm And Its Equilibrium Realization Path Under Repeated Game
10	Research And Application Of Incomplete Information Game Algorithm Based On Reinforcement Learning And Game Tree Search