Research On The Method Of Chinese Text Categorization Based On Machine Learning

Posted on: 2010-01-09  Degree: Master  Type: Thesis
Country: China  Candidate: Y L Liu  Full Text: PDF
GTID: 2178360272982472  Subject: Information Science
Abstract/Summary:
With the development of information technology and the spread of the Internet, the volume of available information is growing explosively, and there is strong demand for technology that can organize and manage this information effectively and with high quality. Text categorization is necessary for locating information accurately and rapidly, and it effectively supports information extraction. Text categorization methods based on machine learning have shown better performance than traditional text categorization models and have become a classic example of research and application in the field.

This paper first introduces the general process and key technologies of text categorization, then analyzes the current state of research and sets out the main research content, which is based on machine learning theory. To address the limitations of current text categorization systems, the paper designs a text categorization model for the IT field, comprising the construction of an IT-field text corpus, a proposed WDP feature processing approach, and a combined classifier of SVM and NB. The model effectively improves the accuracy of the feature vectors and of the categorization method, and overcomes shortcomings of current categorization methods. The experimental results show that the recall and precision of the system that adopts the IT-field model are markedly improved.
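The abstract reports its results as recall and precision. As a reminder of what those two measures mean for a categorizer, here is a minimal Python sketch that computes them per category from parallel lists of true and predicted labels. The function name, the category names, and the sample labels are all hypothetical and chosen for illustration only; they are not data or code from the thesis.

```python
from collections import Counter

def per_category_precision_recall(gold, predicted):
    """Return {category: (precision, recall)} for two parallel label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    true_pos = Counter()    # documents correctly assigned to a category
    pred_count = Counter()  # documents the classifier assigned to a category
    gold_count = Counter()  # documents that truly belong to a category
    for g, p in zip(gold, predicted):
        gold_count[g] += 1
        pred_count[p] += 1
        if g == p:
            true_pos[g] += 1
    scores = {}
    for cat in set(gold_count) | set(pred_count):
        # Precision: of the documents assigned to cat, how many were right.
        precision = true_pos[cat] / pred_count[cat] if pred_count[cat] else 0.0
        # Recall: of the documents truly in cat, how many were found.
        recall = true_pos[cat] / gold_count[cat] if gold_count[cat] else 0.0
        scores[cat] = (precision, recall)
    return scores

# Hypothetical labels, for illustration only (not results from the thesis).
gold = ["hardware", "software", "network", "software", "hardware"]
predicted = ["hardware", "software", "software", "software", "network"]
scores = per_category_precision_recall(gold, predicted)
```

With these made-up labels, the "software" category has perfect recall but lower precision, because one "network" document was wrongly assigned to it. This is the kind of trade-off that reporting both measures is meant to expose.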
Keywords/Search Tags: Text Categorization, Machine Learning, Feature Processing, Combined Classifier