Font Size: a A A

Research On Texas Poker Game Based On Counterfactual Regret Minimization Algorithm

Posted on:2016-05-20Degree:MasterType:Thesis
Country:ChinaCandidate:W J TengFull Text:PDF
GTID:2348330503486909Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Texas Hold’em Poker, as a typical representative of the imperfect information game, is a worldwide popular game. Dealing the cards randomly leads to the non-deterministic of information; opponents’ hand information is invisble to players,which leads to the imperfect information; Four round of betting actions make the game repeating, and so on. These characteristics determine its game tree has an exponential level of game states. Even the simplest form of Texas Hold’em Poker,two-player limit hold’em, the whole game tree consists of 3.19 × 1017 game states.Counterfactual Regret Minimization is the current state-of-the-art approach to computing capable strategy profiles for these large extensive-form games. CFR algorithm compute the counterfactual regret and average stategy of each information set’s action in the game tree, the minimization counterfactual regret and average stategy can be used to predict the action of next time. There are two challenges for established CFR to handle games at this scale: memory and computation. During computation, CFR must store the resulting solution and the accumulated regret values for each information set. The topic aims at the above research questions, has made improvements on CFR algorithm, and has realized a Texas Hold’em game system with a higher level of intelligence.The Texas Hold’em game system that this research has realized uses the method of 9-Bucketing hole cards abstraction to reduce the complexity of Texas Hold’em Poker’s state space. Based on the sampling thought of Monte Carlo CFR and the integer arithmetic thought of Pure CFR, two improved CFR algorithms are proposed.According to the different methods of handling opponent’s strategies when offline training and online game, this research call the two improved CFR as offline CFR based on opponent strategy and online CFR based on opponent strategy. These two kinds of improved algorithms not only has improved the computation efficiency of the algorithm and the game’s odds, but also has reduced the storage requirement,compared with the original CFR algorithm. This research based on the optimal strategy that has been produced by the two improved CFR algorithms to select the actions for future, combined with the bluffing strategy, to avoid being modeled by opponents. The 3-kuhn poker game agent based on CFR algorithm, which has been realized by this research, has gained a bronze medal at 2014 annual computer poker competition, which is held by AAAI conference. Then, the Texas Hold’em game system which has been realized by two kinds of improved CFR algorithms in this research, has won the games when it played with the ACPC agents in compared experiments.In daily life, there are lots of decisions have to be made when the information is unclear or unsure. This is similar to the decision-making process of Texas Hold’em Poker and other imperfect information games. CFR algorithm update strategy to make regret minimization, which can be used to solve many practical problems,such as finding the optimal strategy in auction houses and the negotiating table,mastering stock game system, etc.
Keywords/Search Tags:CFR, Texas Hold’em poker, imperfect information game
PDF Full Text Request
Related items