
Research On Key Technology Of Spam Behavior Recognition Based On Data Mining Technology

Posted on: 2008-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2178360215483613
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the rapid growth of the Internet, spam has long been a social problem that attracts public attention. As anti-spam technology is researched and deployed, spammers improve their bulk mail delivery techniques in response, so no single anti-spam technology can solve the whole problem on its own. This thesis focuses on the key technology of Spam Behavior Recognition and proposes a novel approach that models the characteristics of the receiver addresses presented in spam connections. The approach is based on the Unified Theory of Information-Knowledge-Intelligence, uses Association Rule and Sequential Pattern Mining from the Data Mining field, and aims to recognize spam during the email connection phase.
We first show that the previously built Decision Tree Model of Spam Behavior can work collaboratively with Content Filtering Techniques. Integrating Behavior Recognition Techniques with techniques that are already in use is feasible, and it is necessary given the bandwidth that behavior recognition saves.
We then model an essential aspect of Email Behavior, namely the characteristics of the receiver addresses presented in the session layer (the email connection phase), using the Data Mining algorithms of Association Rule and Sequential Pattern Mining. This enables the "Spam Behavior Recognition Model" to additionally recognize the spam behaviors of Email List Attack, Dictionary Receiver Attack and Dynamic IP Spam Sending. Experiments show that the "Spam Behavior Recognition Model" integrated with the Email Addresses List Attack Recognition Model retains a high precision rate, while its recall rate rises to nearly 1.5 times that of the earlier model, which consists only of the Decision Tree Model. In addition, the recall rates of the Dictionary Receiver Attack Recognition Model and the Dynamic IP Spam Sending Behavior Recognition Model are both higher than 50%. We thus provide a new and effective solution for countering spam.
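To illustrate what Association Rule mining over receiver addresses might look like, the following is a minimal Python sketch and not the thesis's actual model. It assumes that each email connection is reduced to the set of receiver addresses it presents, and it mines rules of the form "address A implies address B" using the standard support and confidence measures. The function name `pair_rules`, the thresholds and the sample addresses are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(sessions, min_support=0.5, min_confidence=0.8):
    """Mine rules A -> B between single receiver addresses.

    sessions: list of sets, one set of receiver addresses per connection
    (an assumed representation, not taken from the thesis).
    Returns a sorted list of (lhs, rhs, support, confidence) tuples.
    """
    n = len(sessions)
    item_counts = Counter()  # connections containing each address
    pair_counts = Counter()  # connections containing both addresses
    for recipients in sessions:
        items = sorted(set(recipients))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical connections: a@x.test and b@x.test keep appearing together,
# as they might when a sender works through a fixed address list.
sessions = [
    {"a@x.test", "b@x.test", "c@x.test"},
    {"a@x.test", "b@x.test"},
    {"a@x.test", "b@x.test", "d@x.test"},
    {"e@x.test"},
]
rules = pair_rules(sessions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

In this sketch, addresses that repeatedly co-occur across connections yield high-confidence rules, which is one plausible signal of a reused address list. A real system would need larger itemsets, sequence ordering and thresholds tuned on labeled traffic.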
Keywords/Search Tags: Anti-Spam, Behavior Recognition, Data Mining, Association Rule, Sequential Pattern Mining