
Research and Implementation of an Integrative Anti-Spam System Combining Rule-Based and Content-Based Filtering

Posted on: 2010-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Zhang
GTID: 2178360275973044
Subject: Software engineering
Abstract/Summary:
With the development and spread of the Internet, e-mail has become an indispensable means of communication in daily life. Although it brings great convenience, it also has a negative side: a large share of the mail we receive every day is junk e-mail.

There are currently two approaches to filtering junk e-mail: one is rule-based, and the other is based on Bayesian theory. The former is simple; it filters mail with a variety of methods such as blacklists and whitelists, sender authentication, recipient verification, rate control, and policy control. By comparison, the filtering method based on Bayesian theory is more complex, and its process is as follows. When handling an e-mail, the system first applies a set of rules, each of which carries a weight that can be positive or negative. The total weight of a mail is the sum of the weights of the rules it matches. If the total weight is negative, the mail is judged to be legitimate and handled accordingly; by default it is delivered to the user's inbox or a custom folder. If the total weight is positive, the mail is treated as suspected spam, and by default it is delivered to the user's junk folder. If the total weight exceeds the default threshold, the filter treats the mail as junk and discards it directly without delivering it. Therefore, even if some individual rules are inaccurate, the decision can still be effective, because each rule's result is combined with the results of all the other rules.

This paper researches and implements an integrative anti-spam system that combines rule-based and content-based spam filtering. Testing showed that the system identified 92% of spam, with a misjudgment rate of zero. In March 2008, a telecommunications system in Guangdong with three hundred thousand users was the first to put this integrative mail filtering system into use. Over the following two months of use, sampling inspection and calculation showed a spam interception rate of 98.3% and a misjudgment rate of zero.
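
The weighted scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Rule` class, the `classify` function, the example rules, their weights, and the discard threshold of 10.0 are all hypothetical, chosen only to show how the sum of matched rule weights maps to the three outcomes (inbox, junk folder, discard). The abstract does not say how a total of exactly zero is handled; the sketch assumes such mail is treated as legitimate.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A hypothetical filter rule: a predicate on the message plus a weight."""
    name: str
    weight: float                      # positive leans spam, negative leans legitimate
    matches: Callable[[str], bool]


# Assumed value for illustration; the abstract only says a default threshold exists.
DISCARD_THRESHOLD = 10.0


def classify(message: str, rules: List[Rule],
             discard_threshold: float = DISCARD_THRESHOLD) -> str:
    """Sum the weights of all matching rules and map the total to an action."""
    total = sum(rule.weight for rule in rules if rule.matches(message))
    if total > discard_threshold:
        return "discard"               # judged junk: dropped, not delivered
    if total > 0:
        return "junk_folder"           # suspected spam
    return "inbox"                     # negative total (zero is an assumption here)


# Illustrative rules only; none of these come from the thesis.
rules = [
    Rule("sender_on_whitelist", -5.0,
         lambda m: "from: boss@example.com" in m.lower()),
    Rule("mentions_lottery", 4.0, lambda m: "lottery" in m.lower()),
    Rule("many_exclamations", 3.0, lambda m: m.count("!") >= 3),
    Rule("suspicious_link", 6.0,
         lambda m: "http://win-prize.example" in m.lower()),
]

if __name__ == "__main__":
    print(classify("From: boss@example.com\nOffice lottery pool", rules))
    print(classify("You won the lottery", rules))
    print(classify("Lottery!!! http://win-prize.example", rules))
```

In the first sample message the "mentions_lottery" rule fires on a legitimate mail, but the whitelist rule's negative weight outweighs it. This is the effect the abstract describes: an inaccurate individual rule is compensated by the combined result of the others.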
Keywords/Search Tags:Spam, Rule-Based, Bayesian Theory, Filter, Content Matching, Statistical Probability