
The Application Of MDPs’ Metric In Reinforcement Transfer Learning

Posted on: 2015-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: S G Fan
Full Text: PDF
GTID: 2308330482478945
Subject: Computer technology
Abstract/Summary:
The Markov Decision Process (MDP) is an effective model for many problems in Artificial Intelligence (AI); for example, Reinforcement Learning (RL) is built on the MDP model. In many cases we need to compute the distance between two MDPs. In transfer learning for RL, for instance, we need to know the distance between the source task(s) and the target task(s); likewise, when building a library of MDPs, a distance between MDPs clearly helps reduce the size of the library. To obtain a metric on MDPs, four steps are carried out:

1. measure the distance between two transition probability functions;
2. measure the distance between two states within the same MDP;
3. measure the distance between two states in different MDPs;
4. measure the distance between two MDPs.

This paper presents a metric for measuring the distance between two MDPs with finite state spaces. Its formulation is based on the notion of a metric between two states of a finite-state MDP, originally developed for state aggregation. The metric's properties of non-negativity, symmetry, and the triangle inequality are then proved. The metric is also applied to transfer learning for RL to illustrate its effectiveness. Finally, conclusions are drawn and directions for future improvement are outlined.
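The four steps above can be sketched in code. This is a minimal illustration under simplifying assumptions that are not taken from the thesis: transitions are deterministic (so the step-1 distance between two point-mass transition distributions reduces to the metric distance between the two successor states), the state metric is given the common bisimulation-style fixed-point form d(s, t) = max_a [ |R(s,a) - R(t,a)| + gamma * d(next(s,a), next(t,a)) ], states of different MDPs are compared by embedding both MDPs into one MDP on the disjoint union of their state sets, and the MDP-level distance is taken as a Hausdorff distance over the cross-MDP state distances. The thesis's exact definitions may differ; the function names and parameters here are illustrative.

```python
def state_metric(nxt, rew, gamma=0.9, tol=1e-9, max_iter=1000):
    """Steps 1-2: pairwise state distances within one finite MDP,
    computed by fixed-point iteration.
    nxt[a][s] is the successor of state s under action a (deterministic);
    rew[a][s] is the immediate reward. Returns an n_states x n_states
    matrix that is non-negative and symmetric by construction."""
    n_a, n_s = len(nxt), len(nxt[0])
    d = [[0.0] * n_s for _ in range(n_s)]
    for _ in range(max_iter):
        nd = [[max(abs(rew[a][s] - rew[a][t]) +
                   gamma * d[nxt[a][s]][nxt[a][t]]
                   for a in range(n_a))
               for t in range(n_s)]
              for s in range(n_s)]
        delta = max(abs(nd[s][t] - d[s][t])
                    for s in range(n_s) for t in range(n_s))
        d = nd
        if delta < tol:
            break
    return d

def mdp_distance(nxt1, rew1, nxt2, rew2, gamma=0.9):
    """Steps 3-4: embed two MDPs sharing an action set into one MDP on
    the disjoint union of their states, reuse the state metric to get
    cross-MDP state distances, then take a Hausdorff-style distance
    between the two MDPs (an illustrative choice)."""
    n1 = len(nxt1[0])
    # Disjoint union: MDP2's state indices are shifted by n1.
    nxt = [list(a1) + [s + n1 for s in a2] for a1, a2 in zip(nxt1, nxt2)]
    rew = [list(r1) + list(r2) for r1, r2 in zip(rew1, rew2)]
    d = state_metric(nxt, rew, gamma)
    cross = [row[n1:] for row in d[:n1]]   # d(s in MDP1, t in MDP2)
    h1 = max(min(row) for row in cross)    # how far MDP1 is from MDP2
    h2 = max(min(col) for col in zip(*cross))  # and vice versa
    return max(h1, h2)
```

For example, comparing a two-state, one-action MDP with itself yields distance 0, while changing one reward yields a strictly positive distance; in the transfer-learning setting of the paper, such values would rank candidate source tasks by closeness to the target task.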
Keywords/Search Tags: Markov Decision Process (MDP), State, Metric, Reinforcement Learning (RL), Transfer Learning (TL)