
Research And Implementation Of Naïve Bayes Text Classification Based On Cloud

Posted on: 2013-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: X B Peng
Full Text: PDF
GTID: 2248330392457437
Subject: Computer application technology
Abstract/Summary:
In an age of highly developed Internet, information technology has penetrated people's daily lives, and the Internet holds almost all the information that people need. Data mining was proposed to meet the challenge of finding the knowledge required by different individuals and organizations. Text classification is an important branch of data mining, and how to classify the mass of information on the Internet has become a major challenge in the field of information technology.

This thesis studies and implements an automatic web page classification system that uses the naïve Bayesian text classification algorithm on a cloud platform. Web page preprocessing, training, and classification are all implemented with the MapReduce programming model. The thesis focuses on using the TF-IDF-DI formula to compute feature item weights, which increases the weight of low-frequency feature items that have excellent classification capacity. It also uses a feature-based incremental learning algorithm together with the TF-IDF-DI formula to improve the classification ability and intelligence of the classification system.

The experiments show that performing web page classification with MapReduce improves computing efficiency enough to meet the demand for fast response, that classifying Chinese texts with the naïve Bayesian classification algorithm achieves good results, and that feature-based incremental learning with feature weights computed by the TF-IDF-DI formula instead of the traditional TF-IDF formula enhances the recognition rate of the classifier.
Keywords/Search Tags: Cloud Computing, Data Mining, Text Classification, Naïve Bayes Algorithm, Incremental Learning
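
The abstract describes naïve Bayes training and classification expressed as MapReduce jobs. As a rough illustration of that idea only, the sketch below runs a map step and a reduce step in a single Python process to collect the counts a multinomial naïve Bayes classifier needs. It is not the thesis implementation: the function names (`map_document`, `reduce_counts`, `train`, `classify`), the toy corpus, and the use of Laplace smoothing are all assumptions made for this example, and the TF-IDF-DI weighting is not reproduced because the abstract does not give its formula.

```python
import math
from collections import Counter, defaultdict


def map_document(label, tokens):
    """Map step (hypothetical): emit one count per document and per token."""
    yield (("doc", label, None), 1)
    for term in tokens:
        yield (("term", label, term), 1)


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the values that share a key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def train(corpus):
    """Build per-class document counts and term counts from (label, tokens) pairs."""
    pairs = (kv for label, tokens in corpus for kv in map_document(label, tokens))
    doc_counts = {}
    term_counts = defaultdict(Counter)
    for (kind, label, term), n in reduce_counts(pairs).items():
        if kind == "doc":
            doc_counts[label] = n
        else:
            term_counts[label][term] = n
    vocab = {t for counts in term_counts.values() for t in counts}
    return doc_counts, term_counts, vocab


def classify(tokens, model):
    """Return the class with the highest log posterior (Laplace smoothing assumed)."""
    doc_counts, term_counts, vocab = model
    n_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in doc_counts.items():
        class_total = sum(term_counts[label].values())
        score = math.log(n / n_docs)  # log prior
        for term in tokens:
            if term in vocab:  # terms never seen in training are ignored
                count = term_counts[label][term]
                score += math.log((count + 1) / (class_total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy corpus, invented for illustration only.
sample_corpus = [
    ("sports", ["match", "goal", "team"]),
    ("sports", ["team", "win"]),
    ("tech", ["cloud", "data", "hadoop"]),
    ("tech", ["data", "mining"]),
]
model = train(sample_corpus)
print(classify(["team", "goal"], model))
```

On a real cluster the map and reduce functions would run as distributed tasks over many web pages, and the raw term counts could be replaced by feature weights such as those the thesis computes with TF-IDF-DI.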