Research And Implementation Of Text Classification System Based On VSM

Posted on: 2007-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: S Yang
Full Text: PDF
GTID: 2178360242967231
Subject: Software engineering
Abstract/Summary:
With the development of the Internet, the volume of network information is growing rapidly. To make information services more efficient and precise, it is important that information on the Internet be organized and classified sensibly. Today, search engines are an invaluable tool for finding information on the Internet, yet the power of automatic information retrieval tools remains limited. The user interfaces of search engines have limited expressiveness. Common search engines do not really find information; they only narrow down a huge space of billions of pieces of content. This thesis focuses on processing text information on the network and investigates approaches to classifying it, integrating theory with practice. The investigation covers text representation, feature extraction, and text classification. The thesis consists of three parts. The first part reviews background knowledge related to classification. The second part defines text classification and analyzes the complete process, the core of which is a classifier based on the naïve Bayes algorithm. Compared with an RBF neural network, naïve Bayes is clearly superior for text classification. The third part covers the implementation of the classification system: it describes the function of each module and the database design, with a focus on handling the 21578 data set.
Keywords/Search Tags: Text Classification, Feature Extraction, Naïve Bayes Classifier, VSM
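
The pipeline named in the abstract (documents represented in a vector space model, then classified with naïve Bayes) can be illustrated with a minimal sketch. This is not the thesis's implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the toy documents, and every function name below are assumptions chosen only to show the general technique.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; the thesis's actual data is not reproduced here.
TOY_DOCS = [
    ("wheat corn harvest grain", "grain"),
    ("grain exports wheat prices", "grain"),
    ("oil crude barrel prices", "crude"),
    ("crude oil opec output", "crude"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train(docs):
    """Collect per-class document counts and per-class term frequencies.

    Each document is reduced to a term-frequency vector, which is the
    vector space model (VSM) view of a text.
    """
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tf = Counter(tokenize(text))
        class_docs[label] += 1
        term_counts[label].update(tf)
        vocab.update(tf)
    return class_docs, term_counts, vocab


def classify(model, text):
    """Return the class with the highest multinomial naive Bayes log score."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    tf = Counter(tokenize(text))
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_terms = sum(term_counts[label].values())
        for term, freq in tf.items():
            if term not in vocab:
                continue  # ignore terms never seen in training
            # Add-one smoothing keeps unseen class/term pairs above zero.
            p = (term_counts[label][term] + 1) / (total_terms + len(vocab))
            score += freq * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TOY_DOCS)
    print(classify(model, "wheat harvest"))
    print(classify(model, "opec crude oil"))
```

A real system would add the feature extraction step the abstract mentions (for example, selecting a subset of informative terms) before training; the sketch skips it and uses every term.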