
Research on Improving the Object Focused Q-learning Algorithm

Posted on: 2019-08-08    Degree: Master    Type: Thesis
Country: China    Candidate: Z H Chen    Full Text: PDF
GTID: 2428330596460570    Subject: Signal and Information Processing
Abstract/Summary:
This dissertation focuses on improving the Object Focused Q-learning algorithm. Reinforcement learning is one of the major branches of machine learning, and Q-learning is a classical, fundamental reinforcement learning algorithm. A key disadvantage of Q-learning is that it is ineffective in domains with large state spaces. Object Focused Q-learning is one of the algorithms that improve on Q-learning and can work in such larger domains: by manually classifying the objects in the domain, it decomposes the state space, so the dimensionality of the state space is reduced exponentially and better results can be obtained within a given amount of training time. This dissertation aims to improve the Object Focused Q-learning algorithm with respect to stability and convergence speed. The main work is as follows.

First, the Object Focused Q-learning algorithm is combined with model-based learning: the steps of the Prioritized Sweeping algorithm are added to the original algorithm. Experiments show that this improvement accelerates convergence.

Second, the control strategy used in the original algorithm is changed. The dissertation tests the impact of different control strategies on convergence in different domains; the improved control strategy makes the algorithm converge more stably.

Third, the algorithm is improved in terms of computing-resource utilization by enabling traditional value-based algorithms to be combined with asynchronous model-based learning. The dissertation introduces a simple asynchronous model-learning framework, strongly inspired by the Actor-Critic structure and distributed prioritized experience replay. The framework enables traditional reinforcement learning algorithms to perform model learning asynchronously using multithreading or multiprocessing, so more model information about the environment can be acquired within the same number of episodes and efficiency is improved. With only one actor, the system behaves the same as the traditional Prioritized Sweeping algorithm; with more than one actor, training performance increases stably, and the size of the improvement depends on the application domain and the number of actors.

Finally, the Object Focused Q-learning algorithm is combined with the asynchronous model-learning framework, and the resulting algorithm converges faster than the original algorithm in different domains.
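To make the first contribution concrete, the sketch below shows how a tabular Object Focused Q-learning agent with one Q-table per object class might incorporate a Prioritized Sweeping style planning step. This is an illustrative sketch under assumed names, not the dissertation's implementation: identifiers such as object_classes, planning_steps, theta, and the per-class reward dictionary are assumptions made for this example, and the sweep is simplified in that it only replays transitions pushed onto the priority queue rather than propagating priorities back to predecessor states as full Prioritized Sweeping does.

import heapq
import itertools
import random
from collections import defaultdict

class OFQWithPrioritizedSweeping:
    """Tabular Object Focused Q-learning with a simplified Prioritized Sweeping
    planning step. One Q-table and one learned model are kept per object class;
    actions are chosen greedily over the sum of the per-class Q-values."""

    def __init__(self, object_classes, actions, alpha=0.1, gamma=0.95,
                 epsilon=0.1, theta=1e-3, planning_steps=10):
        self.classes = list(object_classes)    # e.g. ["enemy", "coin"]
        self.actions = list(actions)
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.theta = epsilon, theta
        self.planning_steps = planning_steps
        self.Q = {c: defaultdict(float) for c in self.classes}   # Q[c][(s, a)]
        self.model = {c: {} for c in self.classes}               # (s, a) -> (r, s')
        self.queue = {c: [] for c in self.classes}               # priority heaps
        self._tick = itertools.count()                           # heap tie-breaker

    def act(self, obj_states):
        """Epsilon-greedy action over the summed per-object Q-values."""
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions,
                   key=lambda a: sum(self.Q[c][(s, a)]
                                     for c, s in obj_states.items()))

    def update(self, obj_states, action, rewards, next_obj_states):
        """One Q-learning update per object class, then a planning sweep."""
        for c in self.classes:
            s, s2, r = obj_states[c], next_obj_states[c], rewards[c]
            self.model[c][(s, action)] = (r, s2)
            target = r + self.gamma * max(self.Q[c][(s2, a)] for a in self.actions)
            delta = target - self.Q[c][(s, action)]
            self.Q[c][(s, action)] += self.alpha * delta
            if abs(delta) > self.theta:
                heapq.heappush(self.queue[c],
                               (-abs(delta), next(self._tick), (s, action)))
            self._sweep(c)

    def _sweep(self, c):
        """Replay the highest-priority stored transitions from the learned model."""
        for _ in range(self.planning_steps):
            if not self.queue[c]:
                break
            _, _, (s, a) = heapq.heappop(self.queue[c])
            r, s2 = self.model[c][(s, a)]
            target = r + self.gamma * max(self.Q[c][(s2, b)] for b in self.actions)
            self.Q[c][(s, a)] += self.alpha * (target - self.Q[c][(s, a)])

In a hypothetical training loop, act() would receive a dictionary mapping each object class to the state of its relevant object, and update() would receive the per-class rewards produced by the domain's object decomposition; the asynchronous framework described above would then move the model-learning and sweeping work into separate worker threads or processes.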
Keywords/Search Tags: reinforcement learning, Object Focused Q-learning algorithm, model-based learning, asynchronous system