Research Of Spam Filtering Based On Bayesian Algorithm

Posted on: 2012-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Chen
Full Text: PDF
GTID: 2178330332492731
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, e-mail has become the most commonly used network application and an important means of network communication. Content-based spam filtering is therefore a key problem in the field of Internet security. This thesis describes a spam filtering approach based on the Naive Bayesian algorithm and designs a filtering model built on it. It analyzes spam filtering technology in China and abroad, introduces the background and current status of existing spam filtering methods, and then focuses on methods based on Bayesian filtering. The theory of Bayesian inference is introduced and analyzed from its underlying principles. We introduce a Bayesian mail filtering algorithm that improves on the Naive Bayes algorithm. In the experiments, the class prior probabilities and the class-conditional probabilities of the feature items are first estimated from the training set; the test messages are then classified on this basis; finally, recall and accuracy are used to judge the effectiveness of the algorithm. The focus of this thesis is content-based e-mail filtering, that is, filtering e-mail by analyzing its content. This is essentially a text categorization problem: the text content of a message is preprocessed, and spam is then recognized through text categorization. Both preprocessing and text categorization methods are studied in depth. Several text categorization methods are studied and compared; among them, the Bayesian algorithm is given priority, and its methods and principles are investigated in detail. In the research phase, we proposed a new Bayesian method, tested it on the Ling-Spam corpus on the VC 6.0 platform, and obtained a higher accuracy and recall rate.
Keywords/Search Tags: Naive Bayesian Algorithm, Spam, Feature Extraction, Preprocessing
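
The experimental procedure summarized in the abstract (estimate class priors and class-conditional feature probabilities from a training set, classify test messages, then score by recall and accuracy) can be illustrated with a minimal sketch. This is a plain Naive Bayes baseline, not the improved method the thesis proposes, which the abstract does not specify. The whitespace tokenizer, the Laplace (add-one) smoothing, the function names, and the toy messages are all assumptions made for illustration; the thesis itself used the Ling-Spam corpus on VC 6.0.

```python
import math
from collections import Counter

CLASSES = ("spam", "ham")

def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()

def train(messages):
    """Estimate priors and per-class word counts from (text, label) pairs.

    Assumes every class in CLASSES appears at least once in the training set.
    """
    doc_counts = Counter()
    word_counts = {c: Counter() for c in CLASSES}
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    priors = {c: doc_counts[c] / total_docs for c in CLASSES}
    return priors, word_counts, vocab

def classify(text, model):
    """Return the class with the highest log posterior score."""
    priors, word_counts, vocab = model
    scores = {}
    for c in CLASSES:
        class_total = sum(word_counts[c].values())
        score = math.log(priors[c])
        for word in tokenize(text):
            if word in vocab:  # words unseen in training are ignored
                # Laplace smoothing (an assumption, not from the thesis)
                p = (word_counts[c][word] + 1) / (class_total + len(vocab))
                score += math.log(p)
        scores[c] = score
    return max(scores, key=scores.get)

def evaluate(test_messages, model):
    """Return (spam recall, overall accuracy) on labeled test messages."""
    true_pos = false_neg = correct = 0
    for text, label in test_messages:
        predicted = classify(text, model)
        if predicted == label:
            correct += 1
        if label == "spam":
            if predicted == "spam":
                true_pos += 1
            else:
                false_neg += 1
    spam_total = true_pos + false_neg
    recall = true_pos / spam_total if spam_total else 0.0
    accuracy = correct / len(test_messages)
    return recall, accuracy

# Toy data, invented for illustration only.
train_set = [
    ("win cash prize now", "spam"),
    ("cheap prize offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the thesis draft", "ham"),
]
test_set = [
    ("click now to win cash", "spam"),
    ("draft agenda for the meeting", "ham"),
]

model = train(train_set)
recall, accuracy = evaluate(test_set, model)
print(f"recall={recall:.2f} accuracy={accuracy:.2f}")
```

Scores are summed in log space so that multiplying many small probabilities does not underflow; on a real corpus the tokenizer would be replaced by the preprocessing and feature extraction steps the thesis studies.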