Construction And Implementation Of Self-playing Go System In Supercomputer Cluster

Posted on:2020-05-22

Degree:Master

Type:Thesis

Country:China

Candidate:Z Y Wang

Full Text:PDF

GTID:2428330572472318

Subject:Software engineering

Abstract/Summary:

Go has a history of thousands of years. Because its state space and decision space are astronomically large, the number of board positions a computer can evaluate within an acceptable time is very small. Before 2016, Go was therefore regarded as the most complex intellectual game, and it was generally believed that computer programs would struggle to defeat professional players. Computer Go has evolved over the years, from the early minimax algorithm to Monte Carlo search, and then to the deep learning methods of AlphaGo and AlphaGo Zero, which finally raised computer Go to the top level of play.

This thesis explains how to port an "AlphaGo Zero"-style reinforcement learning model to the Sunway TaihuLight supercomputer so that it runs on Sunway's CPU-based supercomputer cluster without human intervention. The approach has been verified in actual operation, and the thesis describes, analyzes, and summarizes the operating process and the results.

The core work is divided into three parts:
1. Design and implement the overall reinforcement learning process according to the characteristics of the Sunway TaihuLight supercomputer.
2. Study the principle of the Monte Carlo Tree Search algorithm and try to optimize it without reducing its effectiveness.
3. Run the complete process, and make adjustments and improvements in response to the problems encountered in actual operation.

The project faces two major difficulties. First, the Sunway supercomputer cluster offers only a large amount of CPU computing resources, which are much slower than TPUs and GPUs for deep learning computation. Second, because of the particular architecture of Sunway, the Monte Carlo Tree Search algorithm described in the "AlphaGo Zero" paper is not applicable to this project, so this problem also has to be explored.
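
For orientation, the selection step of the Monte Carlo Tree Search described in the "AlphaGo Zero" paper (the PUCT rule) can be sketched as follows. This is only a minimal illustration of the published baseline rule, not the modified algorithm developed in the thesis, which the abstract does not detail. The names `Node`, `select_child`, and the value `c_puct=1.5` are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node (hypothetical structure for illustration)."""
    prior: float                 # P(s, a): move probability from the policy network
    visit_count: int = 0         # N(s, a)
    value_sum: float = 0.0       # W(s, a): sum of backed-up values
    children: dict = field(default_factory=dict)  # move -> Node

    def q(self) -> float:
        # Mean action value Q(s, a); defined as 0 for unvisited nodes.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (move, child) pair maximizing Q + U under the PUCT rule.

    U(s, a) = c_puct * P(s, a) * sqrt(sum_b N(s, b)) / (1 + N(s, a))
    c_puct is an assumed exploration constant, not a value from the thesis.
    """
    total_visits = sum(child.visit_count for child in node.children.values())
    sqrt_total = math.sqrt(total_visits)

    def score(item):
        _, child = item
        u = c_puct * child.prior * sqrt_total / (1 + child.visit_count)
        return child.q() + u

    return max(node.children.items(), key=score)


# Usage: an unvisited move with a reasonable prior gets a large exploration bonus.
root = Node(prior=1.0)
root.children["a"] = Node(prior=0.6, visit_count=10, value_sum=5.0)
root.children["b"] = Node(prior=0.4)
best_move, _ = select_child(root)
```

In this toy position the unvisited move `b` is preferred over the well-explored move `a`, because the exploration term shrinks as a move's visit count grows.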

Keywords/Search Tags:

computer go, artificial intelligence, reinforcement learning, monte-carlo tree search, parallel computing

Related items

1	Improvement Of Monte Carlo Tree Search Algorithm In Two-person Game Problem
2	Computer Go and Monte Carlo Tree Search: Opening Book and Parallel Solutions
3	Research On The Computer Game Key Technology And System Design Based On Checkers Complete Information
4	Computer Poker Based On Monte-Carlo Tree Search
5	The Research On Symbolic Regression Based On Reinforcement Learning
6	Monte Carlo Tree Search For "Dou Di Zhu"
7	Research Of Computer Go Based On Expert System And Monte Carlo Method
8	Computer Go Game Research Based On Monte Carlo Tree Search
9	GPU High-Performance Computing Applied Research And Experimental In Computer GO
10	Research On Anomaly Localization Of Multi-Dimensional Monitoring Indicators In Artificial Intelligence IT Operations