Research And Realization Of Complete Information Game Theory Based On Reinforcement Learning

Posted on:2022-03-27

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Zhang

Full Text:PDF

GTID:2480306605471804

Subject:Master of Engineering

Abstract/Summary:

As a product of the combination of modern computer technology and game theory, computer game playing has become one of the most popular research directions in artificial intelligence. Complete information games are the main research direction within computer game playing because of their wide applicability. In recent years, with the development of artificial intelligence, deep learning, reinforcement learning, and related algorithms have been widely used to solve complete information game problems. In particular, the combination of neural network training and reinforcement learning has produced a range of game-playing algorithms, for example for Go, one of the representative complete information games. The series of AlphaGo versions developed by the DeepMind team successively defeated top players such as Lee Sedol and Ke Jie, pushing AI to a new stage of development. Results obtained on complete information games such as Go can also be applied to other fields, such as chess deduction, intelligent decision-making systems, financial decision-making, and robot control.

Gomoku, another typical complete information game, has become the second most popular board game in the world, behind only chess, because its rules are simple and easy to understand. It is representative of games of this kind, and it is also easy to study, because the playing strength of a model and the speed of its training can be observed quickly. Therefore, based on the theory of reinforcement learning and complete information games, we use Gomoku as a model problem and study in depth how fast training evolves the model and how diverse the resulting game records are. The main research contents and innovations are as follows:

(1) We combine deep learning algorithms with the Monte Carlo tree search used in reinforcement learning to train the Gomoku model on ordinary computer hardware. We also address shortcomings of existing methods: for example, we assign a small initial value to the probability that the policy network computes for each unplaced position, and we use different move strategies depending on the number of steps played in a game. To verify the feasibility of these optimizations, we analyze the changes in the loss value of the network trained in this way, the changes in the playing level of the model, and the evolution of the move strategies seen in the game records. The result is a Gomoku AI model that plays more strongly than an ordinary human player.

(2) To address the low efficiency of Monte Carlo tree search under multithreading, we propose a Monte Carlo tree search method based on a falling-leaves mechanism. From the start to the end of self-play within a given generation, each thread inherits only one search tree and, depending on the total number of nodes in the tree, drops nodes with lower value information at appropriate times, thereby increasing the search depth and width of the tree. Experiments show that the algorithm effectively improves the evolution speed of the network without increasing training time, gives the Gomoku AI model a self-learning effect during self-play, enriches the diversity of the game records, and reduces over-fitting.

(3) To address the bottleneck problem of the Gomoku AI model, we propose a dual-chain parallel neural network structure. The shared part of the original network is changed from a single-chain serial structure to a dual-chain parallel structure. Experiments show that the dual-chain model evolves faster, and that adding the second chain effectively reduces over-fitting when the model's playing level reaches a bottleneck and enhances the model's ability to keep evolving.
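
Contribution (1) above mentions giving unplaced positions a small initial probability and changing the move strategy with the number of steps in a game. The abstract does not give the exact rules, so the following is only a minimal sketch under stated assumptions: the floor value `eps`, the cutoff `explore_moves`, and the function names `floor_priors` and `select_move` are all hypothetical. The "different strategies" are assumed here to be sampling from MCTS visit counts early in the game and playing the most-visited move later, a common choice in AlphaZero-style training that may differ from the thesis.

```python
import random


def floor_priors(priors, legal, eps=1e-3):
    """Give every unplaced (legal) position at least `eps` prior, then renormalize.

    priors: dict mapping position -> probability from a policy network.
    legal:  iterable of unplaced positions.
    eps:    hypothetical floor value; the thesis does not state the one it uses.
    """
    legal = list(legal)
    if not legal:
        raise ValueError("no legal moves")
    # A position the network scored near zero still gets a chance to be explored.
    floored = {pos: max(priors.get(pos, 0.0), eps) for pos in legal}
    total = sum(floored.values())
    return {pos: p / total for pos, p in floored.items()}


def select_move(visit_counts, move_number, explore_moves=10, rng=random):
    """Pick a move from MCTS visit counts (at least one count must be positive).

    Assumed rule: before `explore_moves` steps, sample in proportion to the
    visit counts to diversify openings; afterwards play the most-visited move.
    """
    moves = list(visit_counts)
    if move_number < explore_moves:
        weights = [visit_counts[m] for m in moves]
        return rng.choices(moves, weights=weights, k=1)[0]
    return max(moves, key=lambda m: visit_counts[m])
```

In a self-play loop, `floor_priors` would be applied to the policy output before it seeds the search tree, and `select_move` would be called once per step with the current step count. Raising `explore_moves` trades playing strength during self-play for more varied game records.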

Keywords/Search Tags:

Complete information game, Reinforcement learning, Gomoku game, Monte Carlo Tree Search, Falling leaves mechanism, Dual-chain parallel structure

Related items

1	Research And Application Of Incomplete Information Game Algorithm Based On Reinforcement Learning And Game Tree Search
2	Research On Imperfect Information Game Strategy Based On Fictitious Self-Play
3	Research On Implementation Of Artificial Intelligence Algorithm In Texas Holdem Based On Monte Carlo Tree Search
4	Research And Application Of Incomplete Information Game Decision Based On Game Tree And Deep Learning
5	Research And Application Of Imperfect Information Game Decision Based On Knowledge And Game-tree Search
6	Research On Knowledge Graph Completion Model Combining Temporal Convolutional Network And Monte Carlo Tree Search
7	Research And Application Of Imperfect Game Strategy Based On UCT Algorithm And Deep Reinforcement Learning
8	Application Of GPU Parallel Algorithm In The Solution Of Problems In The Game Theory Of Economics
9	Optimization Algorithms In Stochastic Model Predictive Control Based On Monte Carlo Methods
10	Game Coloring Of Some Particular Planar Graphs