The Research Of The Spam Filtering Method Based On The Behavior Identifying

Posted on: 2010-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Xue
GTID: 2218360305498705
Subject: Communication and Information System
Abstract/Summary:
There is no doubt that spam has become a global problem. Anti-spam technology can analyze data at the transport layer, the session layer, and the content-based filtering layer. Content-based filtering is the most important anti-spam technology, but in real applications it is still immature and its effect is limited: it consumes a lot of memory and runs slowly.

Compared with traditional filtering methods, behavior-identifying filtering is a new anti-spam technology. It can improve both the speed and the effect of mail filtering, and it can actively identify spam with unknown content. It detects spam without the technical limitations of traditional pattern matching.

The paper designs a behavior model of the mail sender, in which senders are divided into three categories: (1) normal individual; (2) business user; (3) spam sender. From the analysis of massive data on mail senders, receivers, and subjects, we found that the repetition frequency of a mail's Subject can be used for spam filtering. We therefore propose a new spam filtering method and suggest M1 as the filter threshold. The value of M1 influences the effect of spam filtering based on spam senders' behavior. A mail can rightly be treated as a spam sample only if the repetition frequency of its subject character is larger than M1.

From the analysis of actual network data we obtained the identifying parameters and set the value of M1 to 45, which needs adjustment according to the network conditions.

We designed an algorithm based on mail behavior identifying to filter mail online, with a regular deleting strategy as a supplement. According to the size of available memory, the data can be updated and memory consumption cut down regularly. Under the regular deleting strategy, the normal behavior of some mail sent in groups can be deleted from the queue of spam behavior characteristics if it touches the threshold M1, so that normal mail and spam can be distinguished.

The online experimental results show that the model is feasible and effective: it filters out 72.5% of spam, which is verified by SpamAssassin with a correct rate of 86.5%, with low computational overhead and controllable memory overhead.

This model is suitable for mail service providers to deploy at the boundaries of large-scale networks. The filtering algorithm can work well alongside traditional filtering methods such as the Bayes method to combat spam, and it can also provide them with dynamic new spam samples.
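
The subject-frequency idea described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class name `SubjectFrequencyFilter`, the `normalize` step, the `max_entries` memory budget, and the "keep the most frequent half" pruning policy are all assumptions made for illustration. Only the threshold M1 = 45 and the general idea of periodic deletion come from the abstract.

```python
from collections import Counter

M1 = 45  # threshold reported in the abstract; it depends on the network


class SubjectFrequencyFilter:
    """Illustrative sketch: flag mail whose subject repeats more than M1 times."""

    def __init__(self, threshold=M1, max_entries=10000):
        self.threshold = threshold
        # Hypothetical memory budget: maximum number of distinct subjects kept.
        self.max_entries = max_entries
        self.counts = Counter()

    @staticmethod
    def normalize(subject):
        # Assumed normalization: lowercase and collapse runs of whitespace.
        return " ".join(subject.lower().split())

    def observe(self, subject):
        """Record one mail; return True if its subject now exceeds the threshold."""
        key = self.normalize(subject)
        self.counts[key] += 1
        is_spam = self.counts[key] > self.threshold
        if len(self.counts) > self.max_entries:
            self.prune()
        return is_spam

    def prune(self):
        # Assumed deletion policy: keep only the most frequent half of the
        # budget, discarding rarely repeated subjects to bound memory use.
        keep = self.counts.most_common(self.max_entries // 2)
        self.counts = Counter(dict(keep))


if __name__ == "__main__":
    spam_filter = SubjectFrequencyFilter()
    for _ in range(50):
        flagged = spam_filter.observe("Cheap watches, buy now")
    print("flagged as spam:", flagged)
```

In this sketch a subject is flagged only once its count is strictly larger than the threshold, matching the abstract's "larger than M1" wording. A deployment would also need a time window for the counts, which the abstract does not specify.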
Keywords/Search Tags:Spam, Behavior Identify, Spam Filtering, Behavior Parameter, Subject Character