
Research On Decision Making Method Of Maritime Combat Simulation Based On Deep Reinforcement Learning

Posted on: 2024-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Xia
Full Text: PDF
GTID: 2542307127973469
Subject: Ships and marine structures, design of manufacturing
Abstract/Summary:
Technological progress not only changes how people live but also affects the military field. Advanced weapons and equipment have changed traditional ways of fighting and place higher demands on the speed of decision-making and the formulation of tactics on the battlefield. Artificial intelligence has achieved remarkable results in continuous, complex decision-making tasks and offers better solutions to combat decision-making problems; the military field is undergoing a new revolution. Deep reinforcement learning, a core artificial intelligence technology, gives computers powerful representation and autonomous learning abilities and has performed well in games, robotics and other fields. It is a key technology for solving decision-making problems, and introducing it into the combat domain will promote the development of military intelligence.

In deep reinforcement learning, the agent formulates and adjusts its policy according to reward feedback from the environment. When rewards are hard to obtain, learning becomes very difficult, and combat simulation environments suffer from a severe sparse reward problem. To achieve efficient and accurate intelligent decision-making, this dissertation studies reinforcement learning algorithms for combat simulation environments by mitigating the sparse reward problem. First, the research background of military intelligence is introduced and the state of research on deep reinforcement learning algorithms is reviewed. Then the key concepts of reinforcement learning are briefly introduced and the existing classical problems are analyzed. Finally, two training methods are proposed to address the sparse reward problem, based on a hypothetical red-versus-blue confrontation at sea. The main work is as follows:

(1) A hierarchical approach is proposed that combines rules with reinforcement learning to train agents. Based on hierarchical reinforcement learning, combat tasks are divided into an upper level and a lower level, with the upper level controlling the lower level. A reinforcement learning algorithm is trained to make the upper-level decisions, while the lower-level action details are carried out by predefined rules. Hierarchical processing and the rule base mitigate the sparse reward problem. In simulation and deduction experiments, the rule-based hierarchical reinforcement learning method performs significantly better: 3.5% higher than the agent guided by pure rules and 2.1% higher than the agent trained by reinforcement learning alone.

(2) A MAAC algorithm based on hindsight experience replay, abbreviated HER-MAAC, is proposed. To address low sample utilization, sparse rewards and slow convergence in multi-agent reinforcement learning, goals are selected according to the hindsight experience replay strategy, rewards are recalculated for failed exploration experiences with respect to those goals, and the relabeled experiences are stored in the replay buffer. This increases the proportion of successful experiences in the buffer and thus improves sampling efficiency. Experiments in a standard environment, Grid World, show that combining MAAC with hindsight experience improves the agents' success rate and reward compared with the original algorithm.

(3) Based on the red-versus-blue confrontation scenario at sea, the state space and action space of the scenario are analyzed, and the red agent is trained with the MAAC algorithm and the HER-MAAC algorithm respectively for comparison. The combat simulation experiment verifies that HER-MAAC trains better and obtains higher reward than the original algorithm, and the winning rate of the trained agent increases by 1.5%.

In this dissertation, the rule-based hierarchical reinforcement learning method and the HER-MAAC algorithm improve the training efficiency of agents through expert experience and hindsight experience respectively, and provide a new way to address the sparse reward problem inherent in reinforcement learning. Applying reinforcement learning to a combat simulation system improves combat decision-making ability and represents an initial attempt at maritime combat intelligence.
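
The relabeling step behind hindsight experience replay, as described in item (2), can be illustrated with a minimal single-agent sketch. This is not the HER-MAAC implementation from the dissertation: the grid positions, the `Transition` type, the 0/-1 reward convention and the "final state as substitute goal" strategy are all assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque, List, NamedTuple, Tuple

Pos = Tuple[int, int]  # hypothetical grid cell (row, col)


class Transition(NamedTuple):
    state: Pos
    action: int
    next_state: Pos
    goal: Pos
    reward: float


def sparse_reward(achieved: Pos, goal: Pos) -> float:
    """Assumed sparse reward: 0 when the goal cell is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def her_relabel(episode: List[Transition]) -> List[Transition]:
    """Relabel an episode using the state it actually ended in as the goal.

    Rewards are recomputed against the substitute goal, so a failed episode
    yields at least one transition with a "success" reward.
    """
    if not episode:
        return []
    new_goal = episode[-1].next_state
    return [
        t._replace(goal=new_goal, reward=sparse_reward(t.next_state, new_goal))
        for t in episode
    ]


def store_with_hindsight(buffer: Deque[Transition], episode: List[Transition]) -> None:
    """Store the original transitions plus their hindsight-relabeled copies."""
    buffer.extend(episode)
    buffer.extend(her_relabel(episode))
```

For a two-step episode that never reaches its intended goal, every original reward is -1, while the relabeled copy of the last transition receives reward 0. Storing both copies is what raises the share of successful experiences in the replay buffer.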
Keywords/Search Tags:combat simulation deduction, reinforcement learning, sparse reward, hierarchical reinforcement learning, the HER-MAAC algorithm