
Spam Filtering Method And System Realization

Posted on: 2009-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: X F Li
Full Text: PDF
GTID: 2208360245478604
Subject: Computer application technology
Abstract/Summary:
Email has become an indispensable means of communication in people's lives since it was invented. While enjoying the convenience and speed it brings, people also suffer from spam, phishing attacks and other network fraud. The 2007 email threat trend report published by Commtouch showed that 95 percent of email is spam. Spam has a large negative impact on the Internet and its users. It takes up a lot of system resources, wastes users' time, and brings potential security problems. Moreover, it causes tremendous losses to the national economy. The spam problem has become a global problem.

With the rapid increase in spam, the problem has become more and more serious. Faced with this problem, many countries have taken measures to fight it, hoping to give netizens back a quiet space through legislative means. In addition, researchers have actively studied anti-spam measures from the technical side. From the phases in which spam is produced and transmitted to the final receiving phase, many effective spam defense systems and spam filtering methods have been proposed.

This paper outlines the mainstream anti-spam technologies and studies spam filtering algorithms from the machine learning perspective, placing the spam filtering problem in the machine learning framework.

Essentially, spam filtering is a classification problem, and specifically a two-class problem. Filtering an email basically means judging whether the email is spam or not. If it is spam, the corresponding measures are taken to filter it. Otherwise it is a legitimate email and is left unprocessed. Therefore, many algorithms from the machine learning domain can be used to filter email.

This paper focuses on four algorithms that are widely used in spam filtering: the Bayesian algorithm, the Winnow algorithm, the AdaBoost algorithm and the SVM algorithm. Through comprehensive comparative experiments, the paper analyzes in detail how different samples influence these four algorithms and the scope in which each is applicable.

In addition, the paper implements a spam filtering system based on the email client. The system, which integrates the above four algorithms, is embedded in Outlook as a plug-in. When a new email arrives, the system filters it automatically.
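
To make the two-class framing concrete, the following is a minimal sketch of a Bayesian filter, the first of the four algorithms named above. It is not the thesis's implementation: the class name `NaiveBayesFilter`, the labels `"spam"` and `"ham"`, the whitespace tokenizer and the add-one smoothing are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real feature extraction.
    return text.lower().split()


class NaiveBayesFilter:
    """Illustrative two-class (spam vs. ham) multinomial naive Bayes filter."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        # Record one labelled email and the words it contains.
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def log_score(self, text, label):
        # log P(label) + sum of log P(word | label), with add-one smoothing.
        # Requires at least one training email for the given label.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for word in tokenize(text):
            count = self.word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_spam(self, text):
        # The two-class decision: pick the label with the higher score.
        return self.log_score(text, "spam") > self.log_score(text, "ham")


spam_filter = NaiveBayesFilter()
spam_filter.train("win cash prize now", "spam")
spam_filter.train("cheap pills win money", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch with the project team", "ham")
print(spam_filter.is_spam("win money now"))
print(spam_filter.is_spam("project meeting monday"))
```

In a client-side filter of the kind described, a decision like `is_spam` would be applied to each incoming message. The toy training set here is made up and far too small to say anything about real accuracy.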
Keywords/Search Tags: spam filtering, Bayes, Winnow, AdaBoost, SVM, anti-spam add-in, Outlook