
Research On Multi-player Imperfect Information Game Strategy Based On Fictitious Self-play

Posted on: 2019-05-10  Degree: Master  Type: Thesis
Country: China  Candidate: J B Mao  Full Text: PDF
GTID: 2428330590473919  Subject: Computer Science and Technology
Abstract/Summary:
Machine gaming is a popular and challenging research direction in artificial intelligence and has attracted wide attention from the academic community. In recent years, research on machine gaming has produced a number of remarkable results, such as AlphaGo, which defeated top human Go players, and Libratus. Machine-gaming techniques are now being applied to many practical problems, such as power dispatching, traffic control, and recommendation systems. According to the completeness of the game information, games are divided into perfect-information and imperfect-information games. Many real-world decision problems can be abstracted as strategy-optimization problems in imperfect-information games. However, existing strategy-optimization algorithms for imperfect-information games, such as the one behind Libratus, can only solve two-player games with discrete actions and simple state spaces, and thus cannot be applied well to real-world decision problems. It is therefore of great theoretical and practical significance to study multi-player imperfect-information strategy-optimization algorithms that support continuous actions and complex states.

Based on Fictitious Self-Play and combining deep learning with multi-agent reinforcement learning, this thesis uses Texas Hold'em and the multi-agent particle environment as experimental platforms to study strategy-optimization methods for multi-player imperfect-information games. Traditional methods for the imperfect-information game of Texas Hold'em rely on domain-specific card abstraction to reduce the size of the game tree, and therefore transfer poorly to other games. This thesis instead adopts the Fictitious Self-Play framework, which divides Texas Hold'em strategy optimization into two parts, learning a best-response strategy and learning an average strategy, realized by deep reinforcement learning and imitation learning respectively; this yields a general method for learning optimal strategies.

For two-player Texas Hold'em strategy optimization, this thesis learns the average strategy with a neural-network-based multi-class logistic regression trained on data collected by reservoir sampling, and learns the best-response strategy with a deep Q-network. Without relying on any domain knowledge, the agent achieves performance comparable to traditional iterative algorithms. For multi-player Texas Hold'em strategy optimization, a multi-agent actor-critic algorithm is introduced to learn the best-response strategy: the value network observes the states of all agents, which reduces value-estimation bias and alleviates the instability of traditional reinforcement-learning algorithms in multi-agent environments. Finally, to counter the effect of bad updates on multi-agent strategy optimization, this thesis proposes a multi-agent proximal policy optimization algorithm based on the idea of proximal policy optimization, which guarantees that each update monotonically improves the agent's strategy. In experiments, the algorithm achieves performance similar to or better than other current state-of-the-art reinforcement-learning algorithms.
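The Fictitious Self-Play decomposition described above can be sketched as an NFSP-style agent: with some probability the agent acts with its best-response policy and records the decision for supervised average-strategy learning; otherwise it follows the average policy. The mixing parameter `ETA`, the policy stubs, and the buffer capacity below are illustrative assumptions, not the thesis's actual settings; the reservoir buffer is a standard implementation of Vitter's Algorithm R.

```python
import random

class ReservoirBuffer:
    """Fixed-size buffer that keeps a uniform random sample of everything
    ever added (Vitter's Algorithm R). In NFSP-style training it stores
    best-response decisions as data for supervised average-strategy learning."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.n_seen = 0

    def add(self, item):
        self.n_seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / n_seen, so every
            # item ever seen has an equal chance of being in the buffer.
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.items[j] = item

ETA = 0.1  # anticipatory mixing parameter (assumed value)

def best_response_action(state):
    return hash(state) % 3  # stand-in for an epsilon-greedy DQN action

def average_policy_action(state):
    return random.randrange(3)  # stand-in for the average-policy network

def act(state, sl_buffer):
    """One decision: with probability ETA act by best response and record the
    (state, action) pair for average-strategy learning; otherwise follow the
    average policy."""
    if random.random() < ETA:
        action = best_response_action(state)
        sl_buffer.add((state, action))
        return action
    return average_policy_action(state)

sl_buffer = ReservoirBuffer(capacity=64)
for step in range(5000):
    act(f"state-{step}", sl_buffer)
print(len(sl_buffer.items), sl_buffer.n_seen)  # buffer stays at its capacity
```

Reservoir sampling matters here because the buffer then holds an unbiased sample of the agent's entire best-response history, so the supervised network trained on it approximates the time-averaged strategy rather than only recent play.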
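The monotonic-improvement property attributed to the proximal update above comes from PPO's clipped surrogate objective, which caps how far a single update can move the policy away from the one that collected the data. A minimal single-agent sketch follows; the toy probabilities and the clip range `eps=0.2` are illustrative assumptions, and the thesis's multi-agent variant is not reproduced here.

```python
import math

def clipped_surrogate(new_logp, old_logp, advantages, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - eps, 1 + eps] removes any incentive to move the policy far from
    the data-collecting policy, which is what guards against destructive
    "bad updates" in plain policy-gradient methods.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)
        clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
        # Take the pessimistic (minimum) of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)

# Toy example: the first action's probability rises from 0.5 to 0.9
# (ratio 1.8), but the clipped objective credits at most ratio 1.2.
new_logp = [math.log(0.9), math.log(0.1)]
old_logp = [math.log(0.5), math.log(0.5)]
advantages = [1.0, -1.0]
print(clipped_surrogate(new_logp, old_logp, advantages))  # ≈ 0.2
```

Note the asymmetry: for the positive-advantage action the clip limits the gain (1.2 instead of 1.8), while for the negative-advantage action the minimum keeps the full penalty (-0.8), so the objective is always a pessimistic bound on the unclipped one.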
Keywords/Search Tags: imperfect information game, fictitious self-play, multi-agent reinforcement learning