Application Of Deep Reinforcement Learning In Video Game Playing

Posted on:2016-01-24

Degree:Master

Type:Thesis

Country:China

Candidate:L W Qiu

Full Text:PDF

GTID:2308330479494718

Subject:Computer technology

Abstract/Summary:

PDF Full Text Request

How to let agents learn directly from high-dimensional sensory inputs like speech and vision is one of the long-standing challenges of reinforcement learning(RL) domains. Many successful RL applicationsbased on hand-crafted features combined with linear valuefunctions or policy representations.Obviously, the performance of those systems gravely relies on the quality of the feature representation.Now it is possible to extract high-level features from raw sensory data which is high dimensional because of the advances in deep learning(DL), which leading to breakthroughs in computer vision and speech recognition.These methods use a series of neural network architectures, including convolutional networks,multilayer perceptrons, restricted Boltzmann machines and recurrent neural networks. Also these methods have exploited both supervised and unsupervised learning. It let people curious about whether such techniques could also be beneficial for RL domain.However reinforcement learning presents several challenges from a deep learning perspective. Firstly, most successful deep learning applications have required large amounts of hand-labelled training data. On the other hand, RL algorithms must be able to learn from a scalar rewardsignal that is usually sparse, noisy and delayed. Another issue is that most deeplearning algorithms assume the data samples to be independent, while in reinforcement learning onetypically encounters sequences of highly correlated states. Furthermore, in RL the data distribution changes as the algorithm learns new behaviours, which can be problematic for deep learningmethods that assume a fixed underlying distribution.This paper tries to use the following methodtoovercome these problems. Firstly, we design a type of deep neural network architecture according to our task, and this network can learn successfully control policies from raw video data in complex RL environments. Also,we use model combination method which combines eight types of model that use difference architecture of deep neural network to steady model’s action policy and improve model’s performance. Furthermore, the network istrained with an improved Q-learning algorithm that use a special sample method that samples training data from a lot of historical experiences,and it use mini-batch L-BFGS to update the weights.Experiments shows that, after training with the improved Q-Learning method, RL model which combines deep neural network can exact high level feature and cansuccessful learn control policies with a steady manner. The performance on the video game of this model compared to traditional RL model and NFQ model has significantly improved and the game score of our model in four out of six test game outperformance human player. Game performance after model combinationing were better than the performance of a single model.

Keywords/Search Tags:

Reinforement Learning, Deep Learning, Model Combination, Video Game

PDF Full Text Request

Related items

1	Research On Game Algorithm Of Imperfect Information 3D Video Game Based On Deep Reinforcement Learning
2	Research On Video Game Simulation Algorithms Based On Deep Reinforcement Learning
3	Research And Application Of Decision-making Model For Video Games Based On Deep Reinforcement Learning
4	Research On Imperfect Information Machine Game Based On Deep Reinforcement Learning In 3D Game
5	Research And Design Of Online Public Opinion Text Orientation Analysis System Based On Deep Learning
6	Combination Of Spatial-temporal Domain Based Video Deraining Method
7	Research On Chess Game Based On Deep Reinforcement Learning
8	Video Target Detection Based On Deep Learning And Their System Implementation
9	A Research Of Deep Reinforcement Learning Algorithms In Combination With Multi-relations
10	Research On Recognition Of Fake Reviews Based On Deep Learning