
Rule-based Spam Filtering System Design And Realization

Posted on: 2009-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: M Zheng
Full Text: PDF
GTID: 2208360245461102
Subject: Computer system architecture
Abstract/Summary:
Email is one of the most significant applications of the Internet. It is a simple and effective means of communicating with anyone worldwide. However, the simplicity and ease of use of email are being abused through the sheer volume of unsolicited messages known as spam. Spam not only consumes a great amount of network resources but also spreads harmful information that has a deleterious effect on society. Anti-spam is becoming one of the most attractive research fields worldwide. Some foreign spam producers use servers and computers in our country to relay their messages. This threatens Internet communication and damages the country's image, which makes anti-spam research especially important.

Through the analysis and processing of spam, we designed and implemented a rule-based spam filtering system named Aone, which uses feature extraction and neural network techniques. Spam filter rules are generally static and cannot be updated in time to keep pace with constantly changing spam. Our system integrates feature extraction and neural network techniques, which gives it the ability to automatically extract and learn the changing features of spam. This approach improves the accuracy of distinguishing spam from legitimate mail by dynamically adjusting the traditionally static rules. A well-designed rule set is used to compare and analyze different parts of an email, which yields a similarity score. The higher a mail's similarity score, the more likely it is to be spam.

The main work of this thesis includes the following:
1. Research on the characteristics of spam by analyzing rule sets. One part of the rule sets is provided by the well-known spam filtering system SpamAssassin; another part is the Chinese rule set offered by CCERT. By studying and analyzing feature extraction methods, this thesis proposes a novel rule extraction approach.
2. Research on neural network theory. A neural network is used to optimize the scores of spam features and to produce an integrated rule set in which the more effective rules gain more weight in distinguishing spam from legitimate email.
3. Research on pattern-matching theory. A faster rule-matching algorithm is proposed to meet the requirement of fast and accurate classification.

In conclusion, the system can not only obtain new rules but also easily adjust the score of each rule. Our experiments show that Aone performs well at spam filtering.
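
The abstract describes two ideas: rules matched against different parts of an email contribute to a score, and the per-rule scores are adjusted by learning. The following is a minimal illustrative sketch of those ideas, not the Aone implementation. The three rules, their names, the sample emails, and the use of a single logistic unit as a stand-in for the thesis's neural network are all assumptions made for this example.

```python
import math
import re

# Hypothetical rules: name -> (email part, pattern).
# These are illustrative only, not taken from SpamAssassin or CCERT.
RULES = {
    "SUBJ_FREE": ("subject", re.compile(r"\bfree\b", re.I)),
    "BODY_CLICK_HERE": ("body", re.compile(r"click here", re.I)),
    "FROM_NUMERIC": ("from", re.compile(r"\d{5,}@")),
}

def match_rules(email):
    """Return a 0/1 vector saying which rules fire on the email's parts."""
    return [1 if pattern.search(email.get(part, "")) else 0
            for part, pattern in RULES.values()]

def score(weights, bias, hits):
    """Weighted sum of fired rules; a higher score means more spam-like."""
    return bias + sum(w * h for w, h in zip(weights, hits))

def train(samples, epochs=200, lr=0.5):
    """Adjust rule weights with one logistic unit trained by gradient descent.

    Assumption: this single unit stands in for the neural network
    mentioned in the abstract, whose architecture is not given there.
    """
    weights = [0.0] * len(RULES)
    bias = 0.0
    for _ in range(epochs):
        for email, label in samples:
            hits = match_rules(email)
            pred = 1.0 / (1.0 + math.exp(-score(weights, bias, hits)))
            err = label - pred  # positive when spam is under-scored
            weights = [w + lr * err * h for w, h in zip(weights, hits)]
            bias += lr * err
    return weights, bias

# Made-up training data: label 1 = spam, 0 = legitimate mail.
samples = [
    ({"subject": "Free prize", "body": "Click here now",
      "from": "a1234567@x.example"}, 1),
    ({"subject": "Meeting notes", "body": "See attached agenda",
      "from": "li@uni.example"}, 0),
    ({"subject": "free tickets", "body": "click here to claim",
      "from": "bob@x.example"}, 1),
    ({"subject": "Lunch?", "body": "Are you free at noon",
      "from": "amy@uni.example"}, 0),
]

weights, bias = train(samples)
for (email, label) in samples:
    print(label, round(score(weights, bias, match_rules(email)), 2))
```

In this sketch, rules that fire mostly on spam end up with larger positive weights, which mirrors the abstract's point that more effective rules gain more weight. A real rule set would contain far more rules and would need held-out data to check the learned weights.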
Keywords/Search Tags: spam, neural network, feature extraction, rule filter