Research On The Filtering Of Spam Based On Behavior Recognition

Posted on: 2010-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Hu
Full Text: PDF
GTID: 2178360278467022
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid growth of the Internet, e-mail has come to play an increasingly important part in daily and working life. At the same time, spam has long been a social problem and has drawn public attention. As anti-spam technology is researched and deployed, spammers improve their bulk mail delivery techniques in response, so no single anti-spam technology can solve the whole problem.

This paper focuses on the key technology of Spam Behavior Recognition and proposes a novel approach for recognizing whether an e-mail is spam. The approach is based on the Unified Theory of Information-Knowledge-Intelligence, uses Association Rule Mining from the field of Data Mining, and builds a Behavior Recognition model intended to recognize spam in the e-mail connection phase.

We first analyze the SMTP protocol and the techniques of bulk mail delivery, then abstract the behaviors that can distinguish normal e-mail from spam, and convert every spam log into a record containing the behaviors present in that log.

Next, we obtain Association Rules with an Association Rule Mining algorithm and process these rules to obtain discrete data. We choose the C4.5 algorithm and the Universal Logic algorithm to build a Decision Tree model and a Universal Logic model from the same training data. Finally, we test both models on the same testing data. The test results show that the recall rates and accuracy rates for recognizing spam are all higher than 90%; when the threshold is equal to 0.1502, the recall rates and accuracy rates of the Universal Logic model are higher than those of the Decision Tree model.

If combined with content-based recognition techniques, the approach can not only reduce the load on those techniques but also complement them, ultimately improving the recall rates and accuracy rates of spam filtering. We thus provide a new and effective solution for filtering spam.
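The step of turning spam logs into behavior records and mining association rules from them can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the behavior names (forged_helo, many_recipients, no_reverse_dns), the toy records, and the thresholds are hypothetical and are not taken from the thesis, and the brute-force enumeration stands in for whichever Association Rule Mining algorithm the author actually used.

```python
from itertools import combinations

# Hypothetical behavior records: each entry is (behaviors observed in one
# SMTP session log, whether the message was labeled spam).
RECORDS = [
    ({"forged_helo", "many_recipients"}, True),
    ({"forged_helo", "no_reverse_dns"}, True),
    ({"forged_helo", "many_recipients", "no_reverse_dns"}, True),
    ({"many_recipients"}, False),
    ({"no_reverse_dns"}, True),
    (set(), False),
]


def mine_spam_rules(records, min_support, min_confidence, max_len=2):
    """Enumerate rules of the form {behaviors} -> spam.

    support    = fraction of all records that contain the behaviors AND are spam
    confidence = fraction of records containing the behaviors that are spam
    Brute-force enumeration, suitable only for a handful of behaviors.
    """
    behaviors = sorted(set().union(*(b for b, _ in records)))
    total = len(records)
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(behaviors, size):
            # Labels of every record whose behavior set contains the antecedent.
            labels = [spam for b, spam in records if set(antecedent) <= b]
            if not labels:
                continue
            spam_hits = sum(labels)
            supp = spam_hits / total
            conf = spam_hits / len(labels)
            if supp >= min_support and conf >= min_confidence:
                rules.append((antecedent, supp, conf))
    return rules


if __name__ == "__main__":
    for antecedent, supp, conf in mine_spam_rules(RECORDS, 0.3, 0.9):
        print(f"{set(antecedent)} -> spam  support={supp:.2f} confidence={conf:.2f}")
```

In a pipeline like the one described above, the support and confidence values of such rules would be the kind of quantity that is discretized and passed on as features to a classifier such as a C4.5 decision tree.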
Keywords/Search Tags: Anti-Spam, Behavior Recognition, Data Mining, Association Rule, Decision Tree