Research And Implementation Of The Automatic Chinese Text Categorization

Posted on: 2006-07-05 | Degree: Master | Type: Thesis
Country: China | Candidate: H M Ma | Full Text: PDF
GTID: 2168360155450170 | Subject: Computer application technology
Abstract/Summary:
Text categorization is the assignment of predefined categories to documents based on their content, and it is a core task in text mining. A survey of domestic and international research on the problem shows that improving the categorization performance of text classifiers remains a key issue. For Chinese in particular, there is not yet a consensus on how automatic text categorization should be carried out. By comparing and analyzing the technologies used to implement Chinese text categorization, we discuss the problem further and propose an implementation architecture for it. While constructing the system, we analyze and study in detail its three main parts: Chinese word segmentation techniques, feature selection and extraction algorithms, and categorization algorithms. On the basis of this work, we present our improved algorithms. Finally, we evaluate the categorization performance of the system through experiments. The experimental results show that our improved algorithms are effective and that the categorization performance of the system is satisfactory.
Keywords/Search Tags: Automatic Chinese Text Categorization, Chinese Word Segmentation, Feature Selection and Extraction, Categorization Algorithm
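
The three parts named in the abstract (word segmentation, feature selection, categorization) can be illustrated with a minimal Python sketch. The abstract does not say which algorithms the thesis uses or how it improves them, so every choice here is an assumption made for illustration only: forward maximum matching for segmentation, chi-square scoring for feature selection, and multinomial naive Bayes with Laplace smoothing for categorization. The function names, the toy lexicon, and the training texts are likewise hypothetical.

```python
import math
from collections import Counter, defaultdict


def segment(text, lexicon, max_len=4):
    """Forward maximum matching: at each position take the longest lexicon
    word, falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def chi_square_select(docs, labels, k):
    """Keep the k terms whose best per-class chi-square score is highest."""
    n = len(docs)
    class_count = Counter(labels)
    term_class = defaultdict(Counter)  # term -> class -> docs containing term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_class[term][label] += 1
    scores = {}
    for term, per_class in term_class.items():
        df = sum(per_class.values())
        best = 0.0
        for cls, n_cls in class_count.items():
            a = per_class[cls]      # in class, contains term
            b = df - a              # outside class, contains term
            c = n_cls - a           # in class, lacks term
            d = n - df - c          # outside class, lacks term
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    # Ties are broken by term so the selection is deterministic.
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])


def train_nb(docs, labels, vocab):
    """Collect class priors and per-class counts of the selected terms."""
    prior = Counter(labels)
    counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        counts[label].update(t for t in tokens if t in vocab)
    return prior, counts, vocab


def classify(tokens, model):
    """Return the class with the highest Laplace-smoothed log posterior."""
    prior, counts, vocab = model
    total_docs = sum(prior.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in prior.items():
        total = sum(counts[label].values())
        score = math.log(n_docs / total_docs)
        for t in tokens:
            if t in vocab:
                score += math.log((counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (hypothetical): two sports texts and two finance texts.
lexicon = {"足球", "比赛", "球队", "股票", "市场", "上涨"}
texts = ["足球比赛", "球队比赛", "股票市场", "股票上涨"]
labels = ["sports", "sports", "finance", "finance"]

docs = [segment(t, lexicon) for t in texts]
vocab = chi_square_select(docs, labels, k=2)
model = train_nb(docs, labels, vocab)
```

A new document goes through the same stages: `classify(segment(new_text, lexicon), model)`. Because each stage only consumes the output of the previous one, any single stage could be replaced (for example, a statistical segmenter instead of dictionary matching) without changing the others.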