Research On The Design Of Agent-based Decision Model For Games Based On Reinforcement Learning

Posted on:2021-02-09

Degree:Master

Type:Thesis

Country:China

Candidate:X Wang

GTID:2370330623968600

Subject:Engineering

Abstract/Summary:

At present, most research relies on value-based methods built on the Q function, such as the DQN algorithm, and comparatively little attention has gone to the more intuitive policy-based reinforcement learning methods. Moreover, in games, continuous, high-dimensional state and action spaces are a major obstacle to applying reinforcement learning to decision problems. To address this, this thesis studies a basic policy search method, the deterministic policy gradient algorithm, analyzes its advantages and disadvantages, and corrects its defects. An improved model, the clipped double-Q policy gradient algorithm, is proposed, and the influence of each improved component on the experimental results is discussed. Finally, four continuous, high-dimensional tasks on a game platform are used for training to demonstrate how much the improved algorithm gains on this problem. The thesis consists of five parts.
(1) The first chapter briefly introduces the essence of reinforcement learning and its application fields, then introduces deep learning as a foundational method, describing its history and current state, and finally gives a brief account of the development of deep reinforcement learning (DRL).
(2) The second chapter analyzes the mathematical model underlying reinforcement learning, the Markov decision process, together with the Bellman optimality equation, and presents the two foundations of reinforcement learning methods, value iteration and policy iteration. It then analyzes two model-free solution methods built on value iteration and policy iteration: the Monte Carlo method and the temporal difference method.
(3) Building on policy iteration and the temporal difference method from the previous chapter, the third chapter presents the deterministic policy gradient algorithm, the basic method to be improved. It analyzes the error introduced by the Q estimation network and the error accumulated over updates, and proposes three improvements: clipped double Q-learning, target networks with delayed policy updates, and target policy smoothing regularization.
(4) The fourth chapter uses MuJoCo games accessed through the Gym interface as the environment platform. Using the same environment and network structure, it compares the improved algorithm's performance with that of algorithms from the same policy iteration family, carries out a series of ablation experiments on the individual components of the improved algorithm, and finally discusses the experimental results.
(5) The fifth chapter summarizes the thesis, further discusses the unresolved problems of the deterministic policy gradient algorithm, and outlines prospects for its future improvement and application.
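The improvements listed in part (3) can be illustrated with a minimal sketch of how a critic's training target might be formed: the target action is perturbed with clipped noise (target policy smoothing), and the smaller of two target-critic estimates is used (the double-Q clipping step). This is not the thesis's implementation. The function name, the toy actor and critics, and the default values for `gamma`, `noise_std`, and `noise_clip` are all assumptions made for illustration.

```python
import numpy as np

def clipped_double_q_target(reward, done, next_state,
                            actor_target, q1_target, q2_target, rng,
                            gamma=0.99, noise_std=0.2, noise_clip=0.5,
                            max_action=1.0):
    """Hypothetical helper: build the TD target for a batch of transitions."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action = actor_target(next_state)
    noise = rng.normal(0.0, noise_std, size=next_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Double-Q clipping: use the smaller of the two target-critic estimates
    # to counteract overestimation of the Q value.
    q_next = np.minimum(q1_target(next_state, next_action),
                        q2_target(next_state, next_action))

    # One-step TD target; terminal transitions (done = 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * q_next

# Toy stand-ins for the target networks (assumptions, not trained models).
def toy_actor(s):
    return np.tanh(s.sum(axis=1, keepdims=True))

def toy_q1(s, a):
    return s.sum(axis=1) + a[:, 0]

def toy_q2(s, a):
    return s.sum(axis=1) - a[:, 0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    states = np.array([[0.5, 0.5], [-1.0, 0.0]])
    rewards = np.array([1.0, 0.0])
    dones = np.array([0.0, 1.0])
    print(clipped_double_q_target(rewards, dones, states,
                                  toy_actor, toy_q1, toy_q2, rng))
```

The delayed policy update is not shown here. In a full training loop it would amount to updating the actor and the target networks only once every few critic updates, with the interval being another hyperparameter to choose.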

Keywords/Search Tags:

Deep reinforcement learning, Policy gradient, Clipped double network, Game intelligence

Related items

1	Research And Realization Of Game Strategy Based On Deep Reinforcement Learning
2	Research And Application Of Imperfect Game Strategy Based On UCT Algorithm And Deep Reinforcement Learning
3	Researches On Combinatorial Optimization Methods In Basis Of Deep Reinforcement Learning
4	Dynamic Optimization Design-based Deep Learning Model And Its Application On EEG-based Analysis
5	Research On Complex Games Based On Deep Reinforcement Learning
6	Research And Application Of Incomplete Information Game Algorithm Based On Reinforcement Learning And Game Tree Search
7	Research And Application Of Stock Quantitative Trading Based On Deep Reinforcement Learning
8	Research On Intelligent Cooperative Strategies Of UAV Swarm For Adversarial Tasks
9	An Empirical Study On Paired Trading Investment Strategy Based On Reinforcement Learning
10	Design And Implementation Of Game Decision-making Mechanism Based On Deep Reinforcement Learning Algorithm