
Construction And Implementation Of Self-playing Go System In Supercomputer Cluster

Posted on: 2020-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Wang
GTID: 2428330572472318
Subject: Software engineering
Abstract/Summary:
Go has a history of thousands of years. Because its state space and decision space are astronomically large, a computer can evaluate only a very small number of board positions in an acceptable time frame. Before 2016, it was therefore generally believed that computer Go programs would struggle to defeat professional players, and Go was considered the most complicated intellectual game. Computer Go has evolved over the years, from the early minimax algorithm to Monte Carlo search and then to the deep learning methods of AlphaGo and AlphaGo Zero, which finally raised computer Go to the top level.

This thesis mainly explains how to port the "AlphaGo Zero" reinforcement learning mode to the Sunway TaihuLight supercomputer so that it can run on Sunway's CPU supercomputer cluster without human intervention. The mode has been verified in actual operation, and the thesis presents, analyzes, and summarizes the specific operation process and results. The core work is divided into three parts:
1. Design and implement the overall reinforcement learning process according to the characteristics of the Sunway TaihuLight supercomputer.
2. Study the principle of the Monte Carlo Tree Search algorithm and try to optimize it without reducing its effectiveness.
3. Run the complete process, and make adjustments and improvements according to the problems encountered in actual operation.
The thesis discusses the project with a focus on these three parts.

The implementation of this project faces two major difficulties. On the one hand, the Sunway supercomputer cluster offers only a large amount of CPU computing resources, which are much slower than TPUs and GPUs for deep learning computation. On the other hand, because of the particular structure of the Sunway system, the Monte Carlo Tree Search algorithm described in the "AlphaGo Zero" paper is not applicable to this project, so this problem also needs to be explored.
Keywords/Search Tags: computer Go, artificial intelligence, reinforcement learning, Monte Carlo tree search, parallel computing