
Learning control policies from demonstration in continuous sensory and action space

Posted on: 2016-03-03 | Degree: Ph.D | Type: Dissertation
University: University of Florida | Candidate: McLeod, Adam M | Full Text: PDF
GTID: 1472390017477079 | Subject: Electrical engineering
Abstract/Summary:
Learning to control a bipedal robot is a difficult control task; learning to do so rapidly or automatically is more difficult still. This dissertation discusses techniques that take advantage of a person's innate ability to solve this kind of problem and leverage it to train a software agent. This approach is called 'Learning from Demonstration' (LfD), and here it is applied to a control problem that embodies the dominant physical principle underlying legged locomotion: the inverted pendulum. Human volunteers are tasked with mastering an inverted-pendulum game, and logs of their actions are used to guide a learning model similar in structure to a reinforcement learner. When given access to the human-generated examples of game mastery, the learning model finds a control solution as good as or better than the human demonstrations, while a comparable reinforcement learning model without human-generated data to guide it is unable to find a solution.

Encoding a control policy is a related but distinct problem from policy learning. In addition to the LfD experiments described above, unsupervised feature learning for policy representation is also investigated. Unsupervised feature learning, a form of dimensionality reduction, can be a vital aid to general learning tasks. High dimensionality can make analysis via dimension-reducing algorithms such as Sparsenet or FastICA infeasible due to memory constraints. These techniques require a fixed quantization of the data space, so their performance depends on the size of the data space (measured in dimensions) and on the quantization method used. If the data space is small to begin with, good performance can be expected; if the space is large, poor performance can be expected unless a vast amount of memory is spent to finely (but naively) quantize the space, or a priori knowledge of a method that allows efficient quantization is available.
However, if there exist underlying patterns within the dataset that sparsely populate the total volume of the data space, then it is possible to extract sparse codes without the need to observe whole codes, without enumerating large vectors within a learning algorithm, and without special knowledge of the underlying structure of the data being analyzed. Here, we introduce an algorithm capable of finding an overcomplete set of sparse codes that exist within high-dimensional data spaces. By using the Sparsenet training paradigm to train a set of computational models for regression (e.g. neural networks) as opposed to updating static vectors, we can learn a sparse set of codes from partial training examples. The technique also decouples the memory footprint of the system from the dimensionality of the data being analyzed, allowing the model to encode high-dimensional patterns.
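The partial-example idea above can be illustrated with a minimal sketch. This is not the dissertation's implementation: it is a Sparsenet-style dictionary learner in which every training sample reveals only a random masked subset of its dimensions, and both coefficient inference and dictionary updates use only the observed entries. All names, hyperparameters, and the gradient-descent inference step are illustrative assumptions.

```python
# Hypothetical sketch: learning an overcomplete sparse dictionary from PARTIAL
# training examples. Each sample exposes ~60% of its dimensions via a mask,
# yet a dictionary over the full space is learned. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
dim, n_codes, n_samples = 20, 30, 300        # overcomplete: n_codes > dim

# Synthesize data from a ground-truth dictionary with sparse coefficients.
D_true = rng.normal(size=(dim, n_codes))
D_true /= np.linalg.norm(D_true, axis=0)
A_true = rng.normal(size=(n_codes, n_samples)) * (rng.random((n_codes, n_samples)) < 0.1)
X = D_true @ A_true
masks = (rng.random((dim, n_samples)) < 0.6).astype(float)

def infer(x, m, D, lam=0.05, lr=0.1, steps=60):
    """Gradient descent on sparse coefficients, using observed entries only."""
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        r = m * (x - D @ a)                  # residual on observed dimensions
        a += lr * (D.T @ r - lam * np.sign(a))
    return a

def masked_error(D):
    """Mean squared reconstruction error over observed entries."""
    errs = []
    for i in range(n_samples):
        a = infer(X[:, i], masks[:, i], D)
        r = masks[:, i] * (X[:, i] - D @ a)
        errs.append(r @ r / masks[:, i].sum())
    return float(np.mean(errs))

D = rng.normal(size=(dim, n_codes))
D /= np.linalg.norm(D, axis=0)
err_before = masked_error(D)

for _ in range(5):                           # dictionary-learning epochs
    for i in range(n_samples):
        a = infer(X[:, i], masks[:, i], D)
        r = masks[:, i] * (X[:, i] - D @ a)
        D += 0.02 * np.outer(r, a)           # update driven by observed residual only
        D /= np.linalg.norm(D, axis=0)       # keep dictionary columns unit-norm

err_after = masked_error(D)
print(f"masked reconstruction error: {err_before:.4f} -> {err_after:.4f}")
```

With an all-ones mask this reduces to ordinary Sparsenet-style dictionary learning; the masking is what decouples learning from ever observing whole vectors, which is the property the abstract highlights. Replacing the static dictionary columns with trained regression models (e.g., small neural networks), as the dissertation describes, additionally decouples memory footprint from the data dimensionality.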
Keywords/Search Tags: Space, Data, Model