A Parallel Filtering Model For Spam: Research And Implementation Of The Algorithm

Posted on:2008-04-01

Degree:Master

Type:Thesis

Country:China

Candidate:F P Wang

GTID:2178360212485191

Subject:Communication and Information System

Abstract/Summary:

Electronic mail (e-mail) has become one of the fastest and most economical means of communication. At the same time, the growing problem of junk mail (also referred to as "spam") has created a need for e-mail filtering. Current anti-spam measures commonly include blacklists and whitelists, manually written rules, and keyword-based content filtering.

Another approach applies automated text categorization and information filtering to spam. Such an e-mail filtering system can learn directly from a user's own mail. Text categorization algorithms such as Naïve Bayes, kNN, decision trees and boosting can all be applied to spam filtering. However, the effectiveness of Naïve Bayes is limited by the assumptions of the algorithm, while the other algorithms are more effective but more expensive to compute. To address this trade-off, we propose applying a sliding window to the filter, in the manner of a "pipeline" in computing, and we call the resulting approach parallel filtering. Experiments on a public e-mail corpus show that it is effective.

The thesis is organized as follows:

A summary of the state of spam filtering, introducing the common anti-spam approaches and techniques, with particular attention to spam filtering techniques.

An analysis of the Naïve Bayesian classifier. A sliding window is introduced into Naïve Bayes to address the limitation of the algorithm, yielding a parallel filtering model.

The design and implementation of the model, including details of its main modules.

Experiments on a public e-mail corpus and on e-mails collected by the author, to test the performance of the model.
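For readers unfamiliar with the baseline classifier named in the abstract, the following is a minimal sketch of a multinomial Naïve Bayes spam filter with Laplace smoothing. It is not the thesis's implementation: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the smoothing constant `alpha` and the four training messages are all assumptions made for illustration. It treats each token as independent of the others given the class, which is the standard Naïve Bayes simplification, and it does not attempt the sliding-window parallel model, whose details the abstract does not give.

```python
import math
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Illustrative multinomial Naive Bayes with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, text, label):
        # log P(label) + sum of log P(token | label); tokens are treated
        # as conditionally independent given the label.
        # Assumes at least one training message exists for each label.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for tok in tokenize(text):
            score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


def build_demo_filter():
    """Train on a tiny made-up corpus (not data from the thesis)."""
    f = NaiveBayesSpamFilter()
    f.train("win cash prize now", "spam")
    f.train("cheap pills win big", "spam")
    f.train("meeting agenda for monday", "ham")
    f.train("please review the attached report", "ham")
    return f
```

Calling `build_demo_filter().classify("win a cash prize")` scores the message under both classes and returns the label with the higher log-probability. Working in log space avoids numeric underflow when many small per-token probabilities are multiplied.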

Keywords/Search Tags:

spam filtering, text categorization, Naïve Bayes, parallel filtering, sliding window

Related items

1	The Research And Application Of Text Categorization Arithmetic In Spam Filtering
2	Research On Content-Based Spam Filtering
3	Research On Content-Based Spam Filtering Technology
4	Research On Content-Based Spam Filtering
5	Research On Spam Filtering Technology
6	The Research And Application Of Spam Filtering System Based On Grid
7	Research On Spam Filtering System Based On Naïve Bayes Algorithm
8	A Mixture Of Spam Filtering Technology Research
9	Research On Waste Barrage Filtering Method Based On Bayesian Algorithm
10	Content-based Anti-Spam Filtering