Interactive learning protocols for natural language applications

Posted on:2010-06-16

Degree:Ph.D

Type:Dissertation

University:University of Illinois at Urbana-Champaign

Candidate:Small, Kevin

GTID:1448390002488870

Subject:Artificial Intelligence

Abstract/Summary:

Statistical machine learning has become an integral technology for many informatics applications. In particular, corpus-based statistical techniques have emerged as the dominant paradigm for core natural language processing (NLP) tasks such as parsing, machine translation, and information extraction. However, while supervised machine learning is well understood, its successful application to practical scenarios is predicated on obtaining large annotated corpora and performing significant feature engineering, both notably expensive undertakings.

Interactive learning protocols offer one promising way to reduce these costs by allowing the learner and domain expert to interact during learning, with the aim of both reducing sample complexity and improving system performance. Because the learner may request targeted information, the domain expert's effort is focused on providing the most useful information. This work formalizes a general framework for interactive learning and examines two interactive learning protocols with particular attention to natural language scenarios.

We first examine active learning in two settings. The first is structured output spaces, where multiple predictions must be composed into a structurally coherent global prediction. The second is pipeline models, where a complex prediction is decomposed into a sequence of predictions and each stage explicitly relies on the output of previous stages. These two widely used models are well suited to complex application scenarios where obtaining labeled data is particularly expensive. By allowing the learner to select which examples to label, we demonstrate significant reductions in sample complexity for both semantic role labeling and an entity/relation extraction task.

Second, we introduce the interactive feature space construction protocol, which uses a more sophisticated interaction to incrementally add application-targeted domain knowledge to the feature space. Whereas active learning restricts the interaction to additional labeled data, the interactive feature space construction protocol makes better use of the domain expert by focusing the interaction on direct modification of the feature space to improve performance and reduce sample complexity. Through this protocol, we demonstrate further improvements on our entity/relation extraction system.
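The idea of letting the learner select which examples to label can be illustrated with a minimal pool-based active learning loop. The sketch below is not the dissertation's method: it assumes a plain binary perceptron and a smallest-margin (least confident) query rule, and every name in it (`train_perceptron`, `select_query`, `active_learn`, `oracle`) is hypothetical and chosen only for illustration. The structured-output and pipeline settings described in the abstract would need a more elaborate querying function.

```python
# Illustrative sketch only: pool-based active learning with a binary
# perceptron and smallest-margin querying. All names are hypothetical.

def predict_score(w, x):
    """Linear score w . x for a feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_perceptron(data, dim, epochs=20):
    """Train a perceptron on (x, y) pairs with y in {-1, +1}."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in data:
            if y * predict_score(w, x) <= 0:  # mistake: update the weights
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

def select_query(w, pool):
    """Index of the unlabeled example the model is least confident about."""
    return min(range(len(pool)), key=lambda i: abs(predict_score(w, pool[i])))

def active_learn(pool, oracle, seed_labeled, budget):
    """Query the oracle (domain expert) for at most `budget` labels."""
    labeled = list(seed_labeled)
    pool = list(pool)  # copy so the caller's pool is left untouched
    dim = len(pool[0])
    w = train_perceptron(labeled, dim)
    for _ in range(min(budget, len(pool))):
        i = select_query(w, pool)
        x = pool.pop(i)
        labeled.append((x, oracle(x)))  # the expert labels the chosen example
        w = train_perceptron(labeled, dim)
    return w, labeled

# Toy usage: the last feature is a constant bias term, and the hypothetical
# oracle labels a point by whether its first feature exceeds its second.
toy_pool = [(2.0, 0.5, 1.0), (0.4, 0.5, 1.0), (0.1, 3.0, 1.0), (1.0, 0.9, 1.0)]
toy_oracle = lambda x: 1 if x[0] > x[1] else -1
toy_seed = [((3.0, 0.0, 1.0), 1), ((0.0, 3.0, 1.0), -1)]
weights, labeled_set = active_learn(toy_pool, toy_oracle, toy_seed, budget=2)
```

The learner spends its labeling budget on the examples closest to its current decision boundary, which is one common heuristic for reducing the number of labels needed. Other selection rules could be substituted in `select_query` without changing the loop.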

Keywords/Search Tags:

Interactive learning protocols, Natural language, Sample complexity, Feature space

Related items

1	Research On Machine Learning For Natural Language Processing And Transmission
2	Video-Natural Language Retrieval Based On Deep Learning And Active Learning
3	Deep Reinforcement Learning in Natural Language Scenario
4	Modeling And Learning Of Representations For Natural Language Sentence-level Structures
5	Research And Application On Method Of Generating SQL Through Natural Language Based On Deep Learning
6	Deep Learning Natural Language Generation System For Scientific Literature Based On Microservices
7	Interactive visualizations of natural language
8	Research On Natural Language Programming
9	Research On Dialogable Video Tree Based On Natural Language Understanding
10	Research Of Natural Language Evaluation Model Based On Hyperbolic Space