The Research Of Evaluation Method In Connect6 Based On BP-TD Learning

Posted on:2010-03-05

Degree:Master

Type:Thesis

Country:China

Candidate:X X Li

Full Text:PDF

GTID:2218330371499529

Subject:Computer software and theory

Abstract/Summary:


Search capability and the evaluation function are the two most important factors in the playing strength of a game program. Connect6 has simple rules, but its state space is complex and its game tree has a large average branching factor. This limits the maximum search depth and makes evaluation more important. Evaluation is one of the most difficult problems in game playing, and its accuracy usually determines the quality of the strategy for the next move.

Because of the particular characteristics of Connect6, and in order to make TD learning more efficient, this thesis proposes a two-step move selection strategy. In the first step, weights are allocated to the candidate moves based on their evaluations and on the degree of network confidence, and the next move is selected probabilistically: moves with higher weights are assigned higher probabilities, but every move is assigned a nonzero probability. In the second step, the next move is selected by minimax tree search. Combining these two policies gives TDConn6 both exploration and exploitation.

Using the above-mentioned method and strategy, TDConn6 is implemented in this thesis. It learns from 'zero knowledge'. After being trained 30,000 times, it plays 1,000 games each against NEUConn6 and NEU6Star, and the results are 64.7% and 80.5% respectively, which indicates that the method and the two-step strategy are effective and practical.
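
The probabilistic step of the move selection strategy can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so everything here is an assumption: the function names, the use of a softmax over the evaluations, and the choice to map network confidence (taken to lie in (0, 1]) to the sharpness of the distribution are all hypothetical. The sketch also assumes bounded evaluations, such as network outputs in [0, 1], so that no probability underflows to zero.

```python
import math
import random


def move_probabilities(evaluations, confidence):
    """Turn move evaluations into selection probabilities.

    Hypothetical weighting (not the thesis's formula): a softmax whose
    sharpness grows with `confidence`. Higher evaluations get higher
    probabilities, and every move keeps a nonzero probability.
    """
    if not evaluations:
        raise ValueError("no candidate moves")
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must be in (0, 1]")
    best = max(evaluations)
    # Subtract the best evaluation for numerical stability.
    weights = [math.exp((e - best) * confidence) for e in evaluations]
    total = sum(weights)
    return [w / total for w in weights]


def select_move(moves, evaluations, confidence, rng=random):
    """Sample one move according to the probabilities above."""
    probs = move_probabilities(evaluations, confidence)
    return rng.choices(moves, weights=probs, k=1)[0]


if __name__ == "__main__":
    candidates = ["a", "b", "c"]      # placeholder move labels
    evals = [0.2, 0.5, 0.9]           # placeholder network evaluations
    print(move_probabilities(evals, confidence=0.8))
    print(select_move(candidates, evals, confidence=0.8))
```

With low confidence the distribution is close to uniform, which favours exploration; with high confidence it concentrates on the best-evaluated moves. The minimax step described in the abstract would then replace this sampling with a tree search over the same evaluation function.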

Keywords/Search Tags:

Computer Game Of Connect6, Evaluation Function, TD Algorithm, BP Neural Network, Two-step move selection strategy


Related items

1	Research And Realization Of The Computer Game And System Of Connect6
2	Research Of Key Technologies Of Connection-Pattern Based Computer Connect6
3	Research And Optimization Of The Computer-game System Of Connect6
4	Research And Implementation Of Computer Game Strategy Based On Reinforcement Learning
5	Research About The Key Technologies Of The Connect6 Computer Game
6	Searching Optimization Of Connection-Pattern Based Computer Game
7	The Research And Implementation Of Computer Games Which Based On The Alpha-Beta Algorithm
8	The Research And Application On The Behavior Selection Of Connect6 Based On Intelligent Algorithms
9	Research On Move Prediction In Go Based On Convolutional Neural Networks
10	Research And Application Of Computer Game In The Game 2048