An empirical study on hierarchical text categorization

Posted on:2009-05-01

Degree:M.Sc

Type:Thesis

University:University of Guelph (Canada)

Candidate:Wang, Wei

GTID:2448390002995684

Subject:Computer Science

Abstract/Summary:

Text categorization is the process of automatically assigning new documents to a set of predefined categories. Although many statistical approaches have been applied to text categorization, there is still a need to understand the strengths and weaknesses of individual methods and to find ways of combining them for improved performance. This thesis makes a number of improvements to hierarchical text categorization, including a data analysis that compares four major categorization methods in detail, new ways of combining features across multiple categories, a more efficient training method for K-Nearest Neighbors, data smoothing for Maximum Entropy Modeling, and different ways of combining multiple text categorization methods.
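
As a rough illustration of the task the abstract describes, the sketch below assigns a new document to one of a set of predefined categories using a plain K-Nearest Neighbors classifier over bag-of-words vectors. It is not the thesis's method: the whitespace tokenizer, the cosine similarity measure, the similarity-weighted vote, the toy `TRAINING` set, and the function names are all assumptions made for this example, and the category set is flat rather than hierarchical.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_categorize(doc, training, k=3):
    """Pick the category with the largest summed similarity among the
    k training documents most similar to doc."""
    query = vectorize(doc)
    scored = sorted(
        ((cosine(query, vectorize(text)), label) for text, label in training),
        reverse=True,
    )
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim  # similarity-weighted vote (an assumption)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: (document text, predefined category).
TRAINING = [
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
    ("team wins the final match", "sports"),
    ("striker scores in the match", "sports"),
]

if __name__ == "__main__":
    print(knn_categorize("interest rates and the stock market", TRAINING))
```

Note that this baseline recomputes every training vector at query time; it is meant only to make the task definition concrete, not to reflect the more efficient training method mentioned in the abstract.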

Keywords/Search Tags:

Text categorization

Related items

1	Research Of Hierarchical Text Categorization System Based On VSM And Rule Matching
2	Study On Text Categorization Method Based On Support Vector Machine
3	Research On Text Categorization Based On LDA And SVM
4	Research And Implementation Of Chinese Text Categorization Methods Based On Tree-like Keywords Set
5	A Study On M3-kNN Network And Application In Text Categorization
6	Text Categorization Research Based On Support Vector Machine
7	Design And Implementation Of Kazak Text Categorization System
8	Studies On Some Essential Problems In Automatic Text Categorization
9	Research On Chinese Text Categorization Algorithms Based On Technology Text
10	The Research And Implementation Of Automatic Text Categorization For Chinese Web Documents