A new feature selection method based on support vector machines for text categorization

Posted on: 2007-09-23
Degree: Ph.D
Type: Dissertation
University: The University of Mississippi
Candidate: Xu, Yaquan
Full Text: PDF
GTID: 1448390005978157
Subject: Business Administration
Abstract/Summary:
Text categorization is the task of classifying natural-language text (or hypertext) documents into one or more of a fixed set of predefined categories based on their content. It has applications in a number of areas, including email filtering, Web searching and office automation. In the past, most text categorization was done manually. As the volume of electronic information has increased dramatically over the last 10 years, human categorization has become limited by time and cost. Consequently, interest is growing in technologies for automatic text categorization, which can better help people find, filter and manage information resources. As a new machine intelligence paradigm, Support Vector Machines (SVMs) have tremendous potential for helping people organize resources. The purpose of this dissertation is to present a new method of feature selection based on SVMs and to demonstrate its effectiveness. The dissertation also demonstrates that applying this method with SVMs improves categorization performance and reduces the amount of time required to configure a learning machine.
Keywords/Search Tags: Categorization, Text, Method, New