Rule-based Spam Filtering System Design And Realization

Posted on:2009-08-17

Degree:Master

Type:Thesis

Country:China

Candidate:M Zheng

Full Text:PDF

GTID:2208360245461102

Subject:Computer system architecture

Abstract/Summary:

Email is one of the most significant applications of the Internet. It is a simple and effective means of communicating with anyone worldwide. However, this simplicity and ease of use is abused through the sheer volume of unsolicited email known as spam. Spam not only consumes a great amount of network resources but also spreads harmful information that has a deleterious effect on society. Anti-spam has become one of the most active research fields worldwide. Some foreign spam producers use servers and computers in China to relay their messages. This threatens Internet communication and damages the country's image, which makes anti-spam research especially important.

Through the analysis and processing of spam, we designed and implemented a rule-based spam filtering system named Aone, which uses feature extraction and neural network techniques. Spam filter rules are generally static and cannot be updated in time to keep up with constantly changing spam. By integrating feature extraction and neural network techniques, our system can automatically extract and learn the changing features of spam. This approach improves the accuracy of distinguishing spam from legitimate mail by dynamically adjusting the traditional static rules. A well-designed rule set is used to compare and analyze different parts of an email, which yields a similarity score. The higher a mail's similarity score, the more likely it is to be spam.

The main work of this thesis includes the following:

1. Research on the characteristics of spam by analyzing rule sets. One part of the rule sets is provided by the well-known spam filtering system SpamAssassin; another part is the Chinese rule set offered by CCERT. By studying and analyzing methods of feature extraction, this thesis proposes a novel rule extraction approach.
2. Research on neural network theory. A neural network is used to optimize the scores of spam features, producing an integrated rule set in which the more effective rules gain more weight in distinguishing spam from legitimate email.
3. Research on pattern-matching theory. A faster rule-matching algorithm is proposed to meet the requirement of fast and accurate classification.

In conclusion, the system can not only obtain new rules but also easily adjust the score of each rule. Our experiments show that Aone performs well at spam filtering.
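The idea of scoring an email against weighted rules, and letting a learner adjust those weights, can be illustrated with a minimal sketch. This is not Aone's implementation: the abstract does not specify the network architecture, so a single-layer perceptron stands in for it here, and the rule names, regular expressions, and sample emails are all hypothetical rather than taken from SpamAssassin or CCERT.

```python
import re

# Hypothetical rules: (name, email part to inspect, pattern).
# These are illustrative only, not rules from SpamAssassin or CCERT.
RULES = [
    ("SUBJ_FREE", "subject", re.compile(r"\bfree\b", re.I)),
    ("BODY_CLICK", "body", re.compile(r"click here", re.I)),
    ("BODY_MEETING", "body", re.compile(r"\bmeeting\b", re.I)),
]


def match_vector(email):
    """Return a 0/1 list marking which rules fire on the given email."""
    return [1 if pattern.search(email.get(part, "")) else 0
            for _, part, pattern in RULES]


def score(email, weights, bias=0.0):
    """Weighted sum of fired rules; a higher score means more spam-like."""
    return bias + sum(w * x for w, x in zip(weights, match_vector(email)))


def train(samples, epochs=20, lr=0.5):
    """Perceptron-style learning of rule weights from (email, is_spam) pairs.

    A stand-in for the neural network mentioned in the abstract.
    """
    weights = [0.0] * len(RULES)
    bias = 0.0
    for _ in range(epochs):
        for email, is_spam in samples:
            x = match_vector(email)
            activation = bias + sum(w * xi for w, xi in zip(weights, x))
            predicted = 1 if activation > 0 else 0
            error = int(is_spam) - predicted
            if error:
                # Move the weights of the rules that fired toward the label.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
    return weights, bias


# Hypothetical training data.
SAMPLES = [
    ({"subject": "Free prize", "body": "Click here to claim"}, True),
    ({"subject": "Project update", "body": "Meeting moved to Monday"}, False),
    ({"subject": "Free lunch", "body": "Team meeting at noon"}, False),
    ({"subject": "Hello", "body": "click here now"}, True),
]

weights, bias = train(SAMPLES)
for (name, _, _), w in zip(RULES, weights):
    print(name, w)
```

In this toy setup the rule that fires only on legitimate mail ends up with a negative weight, while the rules that fire on spam end up positive, which mirrors the abstract's point that more effective rules gain more weight. A real system would need far more rules and training data than this.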

Keywords/Search Tags:

spam, neural network, feature extraction, rule filter

Related items

1	Design And Implementation Of Spam Short Message Recognizing System
2	Knowledge Discovery And Control Rule Extraction Based Fuzzy Neural Network
3	Research Of Spam Filtering Based On Bayesian Algorithm
4	Research On Filtering Spam Based On Global IP Reputation System
5	The Research And Implement Of An Integrative Anti-spam System Combining Rule-Based With Content-based
6	Application Research On BWS-SOM Model In Chinese Recognition In Large Character Set
7	The Design And Implementation Of Anti-Spam Engine Based-on Winnow
8	Design And Implementation Of The Integrated Anti-spam System Data Center Side
9	The Research Of The Spam Filtering Method Based On The Behavior Identifying
10	The Research Of Large-Set Offline Handwritten Chinese Characters Recognition