
Research on Improving the Object Focused Q-learning Algorithm

Posted on: 2019-08-08    Degree: Master    Type: Thesis
Country: China    Candidate: Z H Chen    Full Text: PDF
GTID: 2428330596460570    Subject: Signal and Information Processing
Abstract/Summary:
This dissertation focuses on improving the Object Focused Q-learning algorithm. Reinforcement learning is one of the major branches of machine learning, and Q-learning is a classical, fundamental reinforcement learning algorithm. A key disadvantage of Q-learning is that it is ineffective in domains with large state spaces. Object Focused Q-learning is one of the algorithms that improve on Q-learning and can work in such larger domains: by manually classifying the objects in the domain, it decomposes the state space, so the dimensionality of the state space is reduced exponentially and better results can be obtained within a given amount of training time. This dissertation aims to improve the Object Focused Q-learning algorithm with respect to stability and convergence speed. The main work is as follows.

First, the Object Focused Q-learning algorithm is combined with model-based learning: the steps of the Prioritized Sweeping algorithm are added to the original algorithm. Experiments show that this improvement accelerates convergence.

Second, the control strategy used in the original algorithm is changed. The dissertation tests the impact of different control strategies on convergence in different domains; the improved control strategy makes the algorithm converge more stably.

Third, the algorithm is improved in terms of computing-resource utilization by enabling traditional value-based algorithms to be combined with asynchronous model-based learning. The dissertation introduces a simple asynchronous model-learning framework, strongly inspired by the Actor-Critic structure and distributed prioritized experience replay. The framework enables traditional reinforcement learning algorithms to perform model learning asynchronously using multithreading or multiprocessing, so more model information about the environment can be acquired within the same number of episodes and efficiency is improved. With only one actor, the system behaves the same as the traditional Prioritized Sweeping algorithm; with more than one actor, training performance increases stably, and the size of the improvement depends on the application domain and the number of actors.

Finally, the Object Focused Q-learning algorithm is combined with the asynchronous model-learning framework, and the resulting algorithm converges faster than the original algorithm in different domains.
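To make the first contribution concrete, the sketch below shows how a tabular Object Focused Q-learning agent with one Q-table per object class might incorporate a Prioritized Sweeping style planning step. This is an illustrative sketch under assumed names, not the dissertation's implementation: identifiers such as object_classes, planning_steps, theta, and the per-class reward dictionary are assumptions made for this example, and the sweep is simplified in that it only replays transitions pushed onto the priority queue rather than propagating priorities back to predecessor states as full Prioritized Sweeping does.

import heapq
import itertools
import random
from collections import defaultdict

class OFQWithPrioritizedSweeping:
    """Tabular Object Focused Q-learning with a simplified Prioritized Sweeping
    planning step. One Q-table and one learned model are kept per object class;
    actions are chosen greedily over the sum of the per-class Q-values."""

    def __init__(self, object_classes, actions, alpha=0.1, gamma=0.95,
                 epsilon=0.1, theta=1e-3, planning_steps=10):
        self.classes = list(object_classes)    # e.g. ["enemy", "coin"]
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.theta = epsilon, theta
        self.planning_steps = planning_steps
        self.Q = {c: defaultdict(float) for c in self.classes}   # Q[c][(s, a)]
        self.model = {c: {} for c in self.classes}               # (s, a) -> (r, s')
        self.queue = {c: [] for c in self.classes}               # priority heaps
        self._tick = itertools.count()                           # heap tie-breaker

    def act(self, obj_states):
        """Epsilon-greedy action over the summed per-object Q-values."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions,
                   key=lambda a: sum(self.Q[c][(s, a)]
                                     for c, s in obj_states.items()))

    def update(self, obj_states, action, rewards, next_obj_states):
        """One Q-learning update per object class, then a planning sweep."""
        for c in self.classes:
            s, s2, r = obj_states[c], next_obj_states[c], rewards[c]
            self.model[c][(s, action)] = (r, s2)
            target = r + self.gamma * max(self.Q[c][(s2, a)] for a in self.actions)
            delta = target - self.Q[c][(s, action)]
            self.Q[c][(s, action)] += self.alpha * delta
            if abs(delta) > self.theta:
                heapq.heappush(self.queue[c],
                               (-abs(delta), next(self._tick), (s, action)))
            self._sweep(c)

    def _sweep(self, c):
        """Replay the highest-priority stored transitions from the learned model."""
        for _ in range(self.planning_steps):
            if not self.queue[c]:
                break
            _, _, (s, a) = heapq.heappop(self.queue[c])
            r, s2 = self.model[c][(s, a)]
            target = r + self.gamma * max(self.Q[c][(s2, b)] for b in self.actions)
            self.Q[c][(s, a)] += self.alpha * (target - self.Q[c][(s, a)])

In a hypothetical training loop, act() would receive a dictionary mapping each object class to the state of its relevant object, and update() would receive the per-class rewards produced by the domain's object decomposition; the asynchronous framework described above would then move the model-learning and sweeping work into separate worker threads or processes.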
Keywords/Search Tags: reinforcement learning, Object Focused Q-learning algorithm, model-based learning, asynchronous system