
Source Task Selection For Transfer Learning In Reinforcement Learning

Posted on: 2019-12-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J H Song
Full Text: PDF
GTID: 1368330572465073
Subject: Computer Science and Technology
Abstract/Summary:
Reinforcement learning (RL) is an important machine learning technique for solving sequential decision-making problems. After decades of development, it has been successfully applied in many fields such as automatic control, robotics, recommendation systems, and information retrieval. Recent research on using transfer learning methods to solve RL tasks has shown that knowledge learned from a source task can be reused to solve a similar target task more effectively. However, when a source task is dissimilar to the target task, transfer can hurt performance, a phenomenon known as negative transfer, and few studies have focused on how to avoid it. Most existing transfer learning methods assume that similar source tasks have already been selected by humans; only a few methods choose the most similar source tasks based on task similarity measures, and these methods often have strict preconditions. Moreover, there is no well-defined method for determining when negative transfer may occur. This thesis studies how to choose appropriate source tasks and proposes solutions from different perspectives to avoid negative transfer. The main contributions can be summarized as follows:

1. We propose two novel metrics for measuring the distance between two Markov decision processes (MDPs) based on their whole models. Specifically, (1) the two metrics are built on distances between states: we define the notion of homogeneous MDPs and propose a method for computing distances between states in different MDPs. (2) After computing all state distances, we apply the Kantorovich metric and the Hausdorff metric, respectively, to aggregate them into a distance between the two MDPs (an illustrative computation is sketched after the contribution list). The two metrics can be used to select appropriate source tasks for transfer learning in RL according to the distances between tasks, and we propose two transfer methods that transfer the value functions of the selected source tasks to the target task. Experimental results on a benchmark show that our metrics are effective in finding similar tasks and avoiding negative transfer, and that our transfer methods significantly improve the performance of the baseline algorithms.

2. We propose a method that employs a deep neural network to predict the positive/negative transfer performance of a pair of transfer learning tasks, addressing the question "to transfer or not to transfer" for video RL tasks. The task descriptions of video RL tasks can be represented as images, and the relatedness between such tasks is reflected in those images. Specifically, we formalise transfer performance prediction as a binary classification problem and adopt a siamese convolutional neural network (CNN) to learn task features from the images, followed by a fully connected network that predicts whether transfer between a pair of tasks is useful (a model sketch follows the contribution list). Experimental results on benchmarks show that our method accurately predicts transfer performance and significantly outperforms the baseline methods as well as the method most closely related to ours.
3. To construct appropriate curricula for curriculum learning, we propose new methods for building task sequences for transfer learning based on automatic source task creation and task similarity measures. The main contributions are as follows: (1) We propose three operators that modify the target task to generate source task sets, based on an object-oriented representation of RL tasks. (2) For tasks modified by different operators, we propose corresponding task similarity measures. These measures build on the object-oriented representation and are defined according to the similarities and differences between objects, states, and so on. We also define transfer potential, which considers both the similarity between tasks and the difficulty of tasks. (3) We propose automatic curriculum construction methods based on the transfer potential measures (a greedy construction sketch is given after the contribution list). Experimental results on a benchmark show that the proposed methods construct good task sequences, significantly improve learning speed on the target task, and are superior to the most recent existing method.
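As a concrete illustration of contribution 1, the sketch below composes a matrix of pairwise state distances into a single task distance using the Hausdorff metric and the Kantorovich (1-Wasserstein) metric. The state-distance matrix `D`, the uniform marginals, and the use of a generic linear-program solver are assumptions made for illustration only; the thesis computes state distances via homogeneous MDPs, which is not reproduced here.

```python
import numpy as np
from scipy.optimize import linprog

def hausdorff_distance(D):
    """Hausdorff composite of a |S1| x |S2| state-distance matrix D."""
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def kantorovich_distance(D):
    """Kantorovich (1-Wasserstein) composite of D, solved as an optimal
    transport linear program under uniform marginals (an assumption)."""
    n, m = D.shape
    c = D.ravel()  # transport cost per (source state, target state) pair
    # Row-sum constraints: each source state ships total mass 1/n.
    A_rows = np.zeros((n, n * m))
    for i in range(n):
        A_rows[i, i * m:(i + 1) * m] = 1.0
    # Column-sum constraints: each target state receives total mass 1/m.
    A_cols = np.zeros((m, n * m))
    for j in range(m):
        A_cols[j, j::m] = 1.0
    A_eq = np.vstack([A_rows, A_cols])
    b_eq = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

if __name__ == "__main__":
    # Toy example: 4 states in one MDP, 5 in the other.
    rng = np.random.default_rng(0)
    D = rng.random((4, 5))
    print("Hausdorff:", hausdorff_distance(D))
    print("Kantorovich:", kantorovich_distance(D))
```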
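For contribution 2, the following PyTorch sketch shows one plausible shape of the siamese CNN: a shared convolutional encoder embeds the two task images, and a fully connected head outputs a logit for the binary "transfer helps / transfer hurts" decision. The layer sizes, the concatenation of the two embeddings, and the class name `TransferPredictor` are illustrative assumptions, not the exact architecture used in the thesis.

```python
import torch
import torch.nn as nn

class TransferPredictor(nn.Module):
    """Siamese CNN: a shared encoder for both task images and an MLP head
    that predicts whether transfer between the pair is beneficial."""
    def __init__(self, in_channels=3, embed_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, embed_dim),
        )
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # logit: positive vs. negative transfer
        )

    def forward(self, img_source, img_target):
        z_s = self.encoder(img_source)  # encoder weights are shared
        z_t = self.encoder(img_target)
        return self.head(torch.cat([z_s, z_t], dim=1))

# Training would use nn.BCEWithLogitsLoss on labels indicating whether
# transferring from the source task to the target task actually helped.
```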
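For contribution 3, the sketch below illustrates one simple way to turn a transfer-potential score into a curriculum: score each automatically generated source task by a weighted combination of its similarity to the target and its (negated) difficulty, then order the tasks greedily and finish with the target task. The scalarization, the helper names `sim_fn` and `diff_fn`, and the greedy ordering are assumptions for illustration; the thesis defines transfer potential and the construction procedure in detail.

```python
def transfer_potential(similarity, difficulty, alpha=0.5):
    # Hypothetical scalarization: prefer tasks that are similar to the
    # target but cheap to learn; alpha trades off the two terms.
    return alpha * similarity - (1.0 - alpha) * difficulty

def build_curriculum(candidate_tasks, target_task, sim_fn, diff_fn, length=3):
    """Greedy curriculum: pick the `length` candidates with the highest
    transfer potential with respect to the target, then append the target.
    sim_fn(task, target) and diff_fn(task) are assumed user-supplied."""
    scored = sorted(
        candidate_tasks,
        key=lambda t: transfer_potential(sim_fn(t, target_task), diff_fn(t)),
        reverse=True,
    )
    return scored[:length] + [target_task]
```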
Keywords/Search Tags:Source Task Selection, Transfer Learning in Reinforcement Learning, Distance Metrics, Transfer Performance Prediction, Source Task Creation, Curriculum Learning