The Research And Implementation Of Automatic Text Categorization For Chinese Web Documents

Posted on: 2008-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: L Liu
Full Text: PDF
GTID: 2178360218457503
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the information resources available on the web now cover nearly every field of daily life. This abundance leads to the problem of information overload, which has driven rapid progress in web mining and web information retrieval. Handling large-scale data involves data mining and knowledge discovery, in which classification is an important method. This thesis surveys the techniques involved in automatic text classification and studies the crucial ones in depth through experiments. It examines methods for automatic Chinese document segmentation and proposes an improved algorithm based on MM and RMM. It studies text feature selection and, after comparing several algorithms, proposes the method used in this work: DFTF (Document Frequency and Term Frequency). It places emphasis on SVM and implements a classifier. Performance evaluation indicates that this classifier achieves higher accuracy and efficiency.
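The abstract does not describe the improved segmentation algorithm itself. As a rough illustration of the two baselines it builds on, the sketch below assumes MM and RMM stand for dictionary-based forward Maximum Matching and Reverse Maximum Matching, the usual reading of those abbreviations. The function names, the `max_len` parameter, and the tie-breaking rule in `bidirectional_match` are hypothetical choices made for this example, not the thesis's method.

```python
def forward_max_match(text, vocab, max_len=4):
    """MM: scan left to right, taking the longest dictionary word at each step."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def reverse_max_match(text, vocab, max_len=4):
    """RMM: scan right to left, taking the longest dictionary word at each step."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


def bidirectional_match(text, vocab, max_len=4):
    """Combine MM and RMM with a simple heuristic (an assumption, not the
    thesis's improved algorithm): prefer the segmentation with fewer words,
    and fall back to the RMM result on a tie."""
    fwd = forward_max_match(text, vocab, max_len)
    rev = reverse_max_match(text, vocab, max_len)
    return fwd if len(fwd) < len(rev) else rev


if __name__ == "__main__":
    # Toy dictionary, invented for this example.
    vocab = {"研究", "研究生", "生命", "的", "起源"}
    print(bidirectional_match("研究生命的起源", vocab))
```

On this toy input the two scans disagree: MM greedily takes 研究生 and strands 命 as a single character, while RMM yields 研究 / 生命 / 的 / 起源. Disagreements of this kind are what a combined MM and RMM scheme has to resolve.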
Keywords/Search Tags: text categorization, data mining, SVM (Support Vector Machine), text representation, feature selection, classifier