
Design And Implementation Of Gobang Algorithm Based On Monte Carlo Tree And Neural Network

Posted on: 2022-04-26  |  Degree: Master  |  Type: Thesis
Country: China  |  Candidate: X Y Shen  |  Full Text: PDF
GTID: 2507306548466194  |  Subject: Computer Science and Technology
Abstract/Summary:
A long-held goal of reinforcement learning has been to create algorithms that can learn, in challenging domains, to a level beyond human proficiency. Game theory is a representative framework for optimal strategy in artificial intelligence, and many AI researchers have applied it to board games. AlphaGo combined general game-theoretic principles with reinforcement learning to succeed at Go, making reinforcement learning one of the hottest areas of machine learning today. Gobang is a traditional black-and-white board game of a distinctly strategic character. The game is complex and changeable and has a large branching factor, so it is very difficult to find a winning strategy in a short time frame with limited computing resources. In this paper, targeting the large branching factor and the complex, changeable board positions of Gobang, we study a Gobang algorithm based on Monte Carlo tree search (MCTS) and a deep neural network, using MCTS to simulate Gobang games and training the model entirely by self-play reinforcement learning. On this basis, an intelligent Gobang algorithm based on an improved MCTS and a deep neural network is proposed, which improves the training speed of the model while maintaining its playing strength. The main work of this paper is as follows:

(1) To cope with the very large number of candidate nodes at each layer of the search, this paper uses Monte Carlo tree search to simulate the outcome of a Gobang game. MCTS selects the best next move based on the current board state. When the search is halted by its compute or time budget, the next move is decided from the collected statistics, finding an optimal or near-optimal placement without exhausting all combinations.

(2) To address the gradient explosion, gradient vanishing, and performance degradation that arise in traditional convolutional neural networks as layers deepen and parameters increase, this paper uses a policy-value deep neural network composed of multiple residual blocks to evaluate board positions and sample moves in the Gobang game. The residual structure forms cross-layer connections, which make the network stable and easier to train. Using the residual deep neural network to guide Monte Carlo tree search strengthens the search, raises the quality of the chosen moves, and makes self-play iteration more effective.

(3) For board states that do not require a long search to find the best placement, this paper proposes a definition of the uncertainty of Monte Carlo tree search and terminates the search once a stable best placement has been found. Three kinds of deep neural networks are used to predict the uncertainty of the search. By combining the improved MCTS with the deep neural network, the training speed of the Gobang model is improved while a similar winning rate is maintained. Finally, the Gobang algorithms based on MCTS before and after the improvement, together with the deep neural network, are tested and evaluated respectively. Comparison of Elo ratings verifies the effectiveness of the proposed model.
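To illustrate the selection step described in contribution (1), the following is a minimal sketch of the classic UCB1 rule used by standard MCTS to pick a child node; the function names and the per-child `(visits, total_value)` statistics are illustrative assumptions, not code from the thesis (the thesis's improved variant is guided by the neural network).

```python
import math

def ucb_score(parent_visits, child_visits, child_value, c=1.4):
    # UCB1: average value (exploitation) plus an exploration bonus that
    # shrinks as a child accumulates visits. Unvisited children get
    # infinite score so every move is tried at least once.
    if child_visits == 0:
        return float("inf")
    return (child_value / child_visits
            + c * math.sqrt(math.log(parent_visits) / child_visits))

def select_best_child(stats, c=1.4):
    # stats: list of (visits, total_value) tuples, one per candidate move.
    parent_visits = sum(v for v, _ in stats)
    return max(range(len(stats)),
               key=lambda i: ucb_score(parent_visits, stats[i][0], stats[i][1], c))
```

After the simulation budget is exhausted, the move is typically read off from the collected statistics (e.g. the most-visited child), which matches the abstract's "decide from the collected statistics" step.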
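The cross-layer connection in contribution (2) can be sketched as follows. For brevity this uses two small dense layers in NumPy rather than the convolutional layers the thesis's network actually uses; the point is only the skip connection, where the input is added back before the final activation so gradients can bypass the transformed path.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    # Residual block sketch: out = relu(x + f(x)), where f is a small
    # two-layer transform. The additive skip connection is what keeps
    # deep stacks of such blocks stable and easy to train.
    h = relu(x @ w1)
    return relu(x + h @ w2)
```

With all-zero weights the transform path vanishes and the block reduces to `relu(x)`, which shows why adding blocks cannot make the network worse than the identity mapping.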
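One plausible reading of the early-termination idea in contribution (3) is to stop searching once the current best placement stops changing between checkpoints. The sketch below is an illustrative stability test of that kind, not the thesis's uncertainty predictor (which is learned by neural networks); `visit_history` and `window` are assumed names.

```python
def search_is_stable(visit_history, window=3):
    # visit_history: the index of the most-visited move recorded at each
    # periodic checkpoint during the search. If the leader has not changed
    # over the last `window` checkpoints, further search is unlikely to
    # alter the decision and can be terminated early.
    if len(visit_history) < window:
        return False
    tail = visit_history[-window:]
    return all(move == tail[0] for move in tail)
```

Terminating clear-cut positions early is what lets the improved algorithm train faster while keeping a similar winning rate.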
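The Elo comparison used for evaluation follows a standard update rule; a minimal version is sketched below. The K-factor of 32 is a common default, not a value stated in the abstract.

```python
def elo_update(r_a, r_b, score_a, k=32):
    # Standard Elo update: expected score from the logistic curve with a
    # 400-point scale, then a K-factor step toward the actual result.
    # score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))
    new_a = r_a + k * (score_a - expected_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b
```

Playing many games between the baseline and improved agents and tracking their ratings with this rule yields the Elo comparison the abstract uses to verify the model.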
Keywords/Search Tags:Gobang, Monte Carlo tree search, Deep neural network, Residual block, Uncertainty, Elo rating