Research And Application Of Incomplete Information Game Algorithm Based On Reinforcement Learning And Game Tree Search

Posted on: 2021-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: J W Lei
GTID: 2370330602978881
Subject: Computer Science and Technology
Abstract/Summary:
Game problems arise in all aspects of daily life. According to how much game information is available to the participants, they can be divided into complete information games and incomplete information games. In real life it is often difficult to obtain all of the game information, so many problems, such as business negotiation, advertising pricing, military wargaming, and network security, can be modeled as incomplete information games. With the continuing development of artificial intelligence, using it to solve incomplete information games has become a research hotspot of considerable practical significance. This thesis studies computer game-playing for mahjong. In mahjong, the opponents' hands and the tiles remaining in the wall are hidden from each participant, so mahjong is a typical incomplete information game. Previous mahjong programs were designed mainly around the Expectimax search algorithm. Current research on Expectimax search focuses on two aspects: how to prune the branches of the search tree reasonably, and how to design a reasonable evaluation function. For mahjong, however, the pruning strategy and evaluation function of Expectimax search have been designed only from hand-crafted prior knowledge and have not been combined with reinforcement learning. To address this, the thesis proposes an incomplete information game algorithm that combines Double DQN with Expectimax search. The algorithm uses the reinforcement learning model Double DQN to improve the pruning strategy and evaluation function of Expectimax search. Specifically, when expanding the Expectimax search tree, the pruning strategy and evaluation function are derived from the value estimates of Double DQN. Conversely, when training the Double DQN model, Expectimax search is used to improve the model's exploration strategy and reward function. Finally, the thesis implements an intelligent decision-making system for mahjong based on the improved algorithm. Compared with a mahjong program built on the traditional Expectimax algorithm, this system achieves a 2.26% higher win rate and an average score per game that is 185.097 points higher, thus reaching a higher level of game decision-making.
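The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which the abstract does not spell out. The block shows the standard Double DQN target (the online network chooses the next action, the target network evaluates it) and an Expectimax search that uses a learned action-value estimate both to prune each decision node to its `top_k` actions and to evaluate leaves. All names here (`double_dqn_target`, `expectimax`, `top_k`, the `toy_*` helpers) are hypothetical, the toy game is not mahjong, and returning 0.0 for states with no legal actions is an assumption made only to keep the example small. The Expectimax-guided exploration and reward shaping used during training are not covered.

```python
from typing import Callable, List, Sequence, Tuple


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      done: bool) -> float:
    """Standard Double DQN target for one transition.

    The online network's Q-values select the next action; the target
    network's Q-values supply the estimate for that action.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


def expectimax(state, depth: int,
               actions: Callable[[object], List[object]],
               outcomes: Callable[[object, object], List[Tuple[float, object]]],
               q_value: Callable[[object, object], float],
               top_k: int) -> float:
    """Depth-limited Expectimax guided by a learned Q-function.

    actions(state)          -> legal actions at a decision node
    outcomes(state, action) -> [(probability, next_state), ...] at a chance node
    q_value(state, action)  -> value estimate (stand-in for a trained Double DQN)
    """
    legal = actions(state)
    if not legal:
        return 0.0  # assumption: terminal states are worth 0 in this sketch
    if depth == 0:
        # Evaluation function: best estimated action value at the leaf.
        return max(q_value(state, a) for a in legal)
    # Pruning strategy: expand only the top_k actions ranked by q_value.
    ranked = sorted(legal, key=lambda a: q_value(state, a), reverse=True)
    best = float("-inf")
    for action in ranked[:top_k]:
        expected = sum(
            p * expectimax(nxt, depth - 1, actions, outcomes, q_value, top_k)
            for p, nxt in outcomes(state, action)
        )
        best = max(best, expected)
    return best


# Hypothetical toy game: the state is an integer, three actions are always legal.
TOY_BONUS = {"a": 2.0, "b": 1.0, "c": -5.0}


def toy_actions(state: int) -> List[str]:
    return ["a", "b", "c"]


def toy_outcomes(state: int, action: str) -> List[Tuple[float, int]]:
    if action == "a":
        return [(0.5, state + 1), (0.5, state + 3)]  # stochastic draw
    if action == "b":
        return [(1.0, state + 1)]
    return [(1.0, state - 5)]


def toy_q(state: int, action: str) -> float:
    return state + TOY_BONUS[action]
```

Calling `expectimax(0, 1, toy_actions, toy_outcomes, toy_q, top_k=2)` expands only actions `"a"` and `"b"` at the root; action `"c"` is pruned because its estimated value is lowest. Choosing `top_k` trades search cost against the risk of pruning an action the value estimate has underrated.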
Keywords/Search Tags: Double DQN, Expectimax Search, Incomplete Information Game, Mahjong Game, Reinforcement Learning