Multi-class classification in natural language processing

Posted on:2002-01-13

Degree:Ph.D

Type:Thesis

University:University of Illinois at Urbana-Champaign

Candidate:Even-Zohar, Yair

GTID:2468390011497378

Subject:Computer Science

Abstract/Summary:

A large number of important decision problems in the natural language domain can be viewed as problems of resolving ambiguity based on properties of the surrounding environment. For example, consider a word prediction task, i.e., predicting a missing word in a sentence. This task can be viewed as a classification problem in which the goal is to select a class label from a collection of candidate class labels. Other examples of such problems include part-of-speech tagging, word-sense disambiguation, accent restoration, and word selection in speech recognition.

Machine learning methods have become the most popular technique for addressing a variety of classification problems. However, many natural language classification problems present two significant sources of difficulty: (i) the information readily available in the sentence in the form of words is not sufficient for the learning algorithm to resolve the ambiguity; (ii) the number of candidate class labels is large.

General-purpose learning algorithms are not well suited to multi-class classification problems, so the second difficulty is usually avoided by focusing on problems with a small set of candidates (typically two).

This thesis addresses both difficulties. We develop a model for multi-class classification that works by sequentially reducing the number of candidates. This model is combined with a strategy for extracting expressive knowledge from the sentence to improve the accuracy of the final classifier. Thus, we decompose the classification problem into two modules: (1) disambiguating among a small set of class labels, and (2) reducing the number of class label candidates. Given an instance of the task and a large set of candidate class labels, the second module reduces the number of candidates by taking a new “multiplicative-like” approach to classification. We call this approach the sequential model.

This thesis presents theoretical and empirical arguments for the advantages of using (i) sentence structure and (ii) the sequential model. The empirical arguments are based on word-prediction and part-of-speech tagging tasks. The theoretical arguments present this work as an extension of current classification methods that aim to disambiguate among many classes.
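
The two-module decomposition described in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names, the toy lexicon, the toy co-occurrence counts, and the top-k pruning rule are all assumptions introduced here for illustration, and the “multiplicative-like” combination itself is not modeled. The sketch only shows the general shape of the idea: a sequence of stages shrinks a large candidate set, and a final classifier chooses among the few candidates that remain.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a scorer maps (context words, candidate label) to a score.
Scorer = Callable[[Sequence[str], str], float]


def reduce_candidates(
    context: Sequence[str],
    candidates: Sequence[str],
    stages: Sequence[Tuple[Scorer, int]],
) -> List[str]:
    """Apply each stage in order, keeping only its `keep` best candidates."""
    remaining = list(candidates)
    for scorer, keep in stages:
        # sorted() is stable, so tied candidates keep their original order.
        ranked = sorted(remaining, key=lambda c: scorer(context, c), reverse=True)
        remaining = ranked[: max(1, keep)]
    return remaining


def classify(
    context: Sequence[str],
    candidates: Sequence[str],
    stages: Sequence[Tuple[Scorer, int]],
    final_scorer: Scorer,
) -> str:
    """Reduce the candidate set, then disambiguate among the survivors."""
    shortlist = reduce_candidates(context, candidates, stages)
    return max(shortlist, key=lambda c: final_scorer(context, c))


# Toy data, invented for this example only.
TOY_POS = {"dog": "noun", "cat": "noun", "tree": "noun", "run": "verb", "blue": "adj"}
TOY_COOCCUR = {("dog", "barked"): 5, ("cat", "barked"): 1}


def noun_scorer(context: Sequence[str], candidate: str) -> float:
    # Stage scorer: assume the sentence structure calls for a noun in the gap.
    return 1.0 if TOY_POS.get(candidate) == "noun" else 0.0


def cooccur_scorer(context: Sequence[str], candidate: str) -> float:
    # Final scorer: sum toy co-occurrence counts with the context words.
    return float(sum(TOY_COOCCUR.get((candidate, w), 0) for w in context))


# Word prediction for "the ___ barked".
context = ["the", "barked"]
candidates = ["run", "dog", "blue", "cat", "tree"]
shortlist = reduce_candidates(context, candidates, [(noun_scorer, 3)])
best = classify(context, candidates, [(noun_scorer, 3)], cooccur_scorer)
```

In this toy run the first stage keeps the three nouns, and the final scorer then picks among them. A real system would replace both toy scorers with learned classifiers and would need a pruning rule that rarely discards the correct label.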

Keywords/Search Tags:

Classification, Natural language

Related items

1	Multi-class classification in natural language processing
2	Research On Text Classification Based On Natural Language Processing And Machine Learning
3	Research And Application Of Text Classification Based On Natural Language Processing
4	Short-Spoken Language Intent Classification With Conditional Sequence Generative Adversarial Network
5	Research On Attention Neural Network And Its Application In Natural Language Understanding
6	Intelligent Device Text Classification Method Based On Natural Language Processing
7	Generating Natural Language Description For Vehicle Trajectories Based On HMM
8	Deep Learning Natural Language Generation System For Scientific Literature Based On Microservices
9	Research On Natural Language Programming
10	The Methodology And Implementation Of Chinese Natural Language Query In Databases