Investigation Of Spam Filtering Technologies Based On Statistical Approach

Posted on: 2007-04-24
Degree: Master
Type: Thesis
Country: China
Candidate: B G Cheng
GTID: 2178360182494714
Subject: Computer application technology
Abstract/Summary:
Spam does considerable harm to the Internet. It wastes disk storage on mail servers and exposes people to harmful information, causing serious trouble for both society and individuals, so reducing spam is a worthwhile effort. Earlier spam-filtering technologies include white lists, black lists, feature matching, and rule-based scoring. This thesis discusses spam-filtering methods that use text categorization techniques from the field of machine learning. Spam filtering is not merely a text categorization problem, however: putting a legitimate e-mail into the spam folder is more serious than letting a spam message into the legitimate folder, so accuracy, and in particular avoiding the misclassification of legitimate mail, plays an important role. We study how to compute a feature's probability and an e-mail's combined probability based on Graham's Bayesian spam filter theory. The thesis covers the following subjects: (1) an introduction to what spam is and its history; (2) a review of spam-filtering technologies, with a comparison that identifies the weaknesses of each method; (3) a discussion of spam-filtering strategies based on e-mail structure, introduced through the framework of the current e-mail system; (4) a discussion of spam filtering using text categorization technologies, Bayesian spam filter theory, text feature selection methods, and the evaluation of spam filters; (5) a study of statistical theory in spam filtering, with particular work on computing feature probabilities and the combined probability of a message; (6) the design and implementation of a client-side spam filter named SpamFilter, based on the statistical approach to spam reduction.
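
The feature probability and combined probability mentioned in the abstract can be illustrated with a minimal Python sketch. It follows the formulas published in Paul Graham's essay "A Plan for Spam", not the thesis's own SpamFilter implementation, which may differ. The function names and parameters are hypothetical, and the constants (double-counting legitimate occurrences, the minimum of 5 occurrences, the 0.01/0.99 clamp, the 0.4 default, and the 15 most interesting tokens) are the values given in that essay.

```python
from math import prod


def token_probability(good_count, bad_count, n_good, n_bad):
    """Estimate the probability that a message containing this token is spam.

    good_count / bad_count: occurrences of the token in legitimate / spam mail.
    n_good / n_bad: number of legitimate / spam messages in the training
    corpus (assumed to be non-zero in this sketch).
    """
    g = 2 * good_count  # legitimate occurrences count double to reduce false positives
    b = bad_count
    if g + b < 5:
        return 0.4  # too little evidence: treat as an unknown, mildly innocent token
    spam_freq = min(1.0, b / n_bad)
    ham_freq = min(1.0, g / n_good)
    p = spam_freq / (ham_freq + spam_freq)
    return max(0.01, min(0.99, p))  # clamp so no single token is decisive


def combined_probability(token_probs, top_n=15):
    """Combine per-token probabilities into one spam probability for a message."""
    # Keep only the most "interesting" tokens, i.e. those farthest from 0.5.
    interesting = sorted(token_probs, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    spam = prod(interesting)
    ham = prod(1.0 - p for p in interesting)
    return spam / (spam + ham)
```

In Graham's essay a message is treated as spam when the combined probability exceeds 0.9. Doubling the legitimate counts and using a high threshold both reflect the asymmetry noted in the abstract: misclassifying a legitimate e-mail costs more than letting a spam message through.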
Keywords: Spam, Bayesian spam filter, Statistical approach