Combining prior knowledge and data: Beyond the Bayesian framework

Posted on:2008-01-02

Degree:Ph.D

Type:Dissertation

University:University of Illinois at Urbana-Champaign

Candidate:Epshteyn, Arkady

Full Text:PDF

GTID:1448390005968341

Subject:Artificial Intelligence

Abstract/Summary:

For many tasks such as text categorization and control of robotic systems, state-of-the art learning systems can produce results comparable in accuracy to those of human subjects. However, the amount of training data needed for such systems can be prohibitively large. A text categorization system, for example, may need to see many text postings manually tagged with their subjects before it learns to predict the subject of the next posting with high accuracy. A reinforcement learning (RL) system learning how to drive a car needs a lot of experimentation with the actual car before acquiring the optimal policy. An optimizing compiler targeting a certain platform has to construct, compile, and execute many versions of the same code with different optimization parameters to determine which optimizations work best. Such extensive sampling can be time-consuming, expensive (in terms of both expense of the human expertise needed to label data and wear and tear on the robotic equipment used for exploration in RL), and sometimes dangerous (e.g., an RL agent driving the car off the cliff to see if it survives the crash). The goal of this work is to reduce the amount of training data an agent needs in order to learn how to perform a task successfully. This is done by providing the system with prior knowledge about its domain. The knowledge biases the agent towards useful solutions and limit the amount of training needed.; We explore this task in three contexts: classification (determining the subject of a newsgroup posting), control (learning to perform tasks such as driving a car up a mountain in simulation), and optimization (optimizing performance of linear algebra operations on different hardware platforms). For the text categorization problem, we introduce a novel algorithm which efficiently integrates prior knowledge into large margin classification. For reinforcement learning, we introduce a novel framework for defining and solving planning problems in terms of qualitative statements about the world. In compiler optimization, Bayesian prior based on an analytic model of hardware is combined with empirical measurements of performance of optimized code to determine the maximum-a-posteriori estimates of the optimization parameters.

Keywords/Search Tags:

Prior knowledge, Text categorization, Data, Optimization

Related items

1	Research On The Application Of Scene Text Recognition Based On The Fusion Of Prior Knowledge
2	Research And Implementation Of Text Categorization Algorithm In Knowledge Management System
3	A Study On The Influence Of Prior Knowledge On NN Model In Several NLP Tasks
4	The Research And Application Of Rough Set In Text Categorization System
5	Research And Application Of Intelligent Evaluation Data Mining Based On Knowledge Graph
6	The Research And Implementation Of Automatic Text Categorization For Chinese Web Documents
7	A Study On Text Categorization Based On Machine Learning
8	The effects of adjunct questions on learning from text inconsistent with prior knowledge
9	Ontology-based Construction Of Hierarchical Categorization System Of Domain Knowledge Base
10	Research Of Hierarchical Text Categorization System Based On VSM And Rule Matching