Content-based Spam Filtering Technology

Posted on: 2011-06-22
Degree: Master
Type: Thesis
Country: China
Candidate: F Z Li
Full Text: PDF
GTID: 2208330332477192
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, e-mail has become an increasingly frequent and indispensable means of everyday communication. Along with this convenience it has brought a growing volume of junk mail, which disrupts people's study and daily life and harms network security. Content-based filtering is currently one of the mainstream technologies for addressing the spam problem: it works on the text of an e-mail, applying text classification and information filtering algorithms to the message content and other features in order to detect and filter spam, and it forms the basis of spam classifier design. The contents of this thesis are as follows. First, it introduces the application of text classification algorithms in mail filtering systems, and compares and summarizes the commonly used feature selection methods and classification algorithms. Second, it uses forward maximum matching to segment sample e-mails into words and extract their features. Third, because the traditional KNN algorithm is slow to search and depends heavily on the storage of training samples, the thesis proposes a combined classification method: each e-mail is first processed by every individual classifier, and then, to overcome the limitations of any single classifier, a KNN decision is recalculated according to whether the individual results agree or differ, so that the strengths of each classifier can be exploited. Finally, a framework for a spam filtering system is designed.
Keywords/Search Tags: spam filtering, text mining, text categorization, KNN, combined classification
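
The abstract names forward maximum matching as the word segmentation step but does not give details. The following is a minimal sketch of the general technique, not the thesis's own implementation. The function name `forward_max_match`, the `vocab` dictionary, and the sample sentence are all assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy left-to-right segmentation (forward maximum matching).

    At each position, take the longest dictionary word that starts there.
    If no dictionary word matches, emit a single character and move on.
    `vocab` is a hypothetical word set; a real system would load a lexicon.
    """
    if max_len is None:
        # Longest dictionary entry bounds the window; never below 1.
        max_len = max(1, max((len(w) for w in vocab), default=1))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink by one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Illustrative dictionary and sentence (assumed, not from the thesis).
vocab = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", vocab))
```

Because the match is greedy, the sketch segments the sample as 研究生 / 命 / 起源 rather than 研究 / 生命 / 起源. This is a known limitation of forward maximum matching. The resulting tokens would then feed the feature selection and classification stages described in the abstract.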