Research And Implementation Of Spam Detection Based On User Behavior Relationship

Posted on: 2015-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2308330482457038
Subject: Computer software and theory
Abstract/Summary:
E-mail is one of the most fundamental Internet applications and accounts for over 30% of data traffic. As its use has grown, the volume of spam has grown quickly as well: spam makes up more than two-thirds of the e-mail that Internet users receive each week. In recent years, because spam has kept increasing despite efforts to block it, researchers have proposed many effective spam detection solutions. However, spammers keep changing their forms and methods as conditions change, so no single technique remains effective in the long term.

This thesis summarizes existing spam detection techniques and focuses on the behavioral relationships between e-mail users. Building on existing behavior-based spam detection technologies, it formulates the concept of e-mail users' social relationships. From the relationships between e-mail users it builds a relationship-behavior network, divides the users into communities with a community partitioning algorithm from complex network theory, and applies the resulting community relationships to actual spam detection.

Drawing on existing behavior identification techniques, we treat the mail traffic behavior network as an interpersonal network.
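To make the idea of a relationship network and its community partition concrete, the following is a minimal sketch, not the thesis implementation. It assumes the mail log is a list of (sender, recipient) pairs, weights each edge by message count, and partitions users by naive greedy modularity merging. The function names (`build_graph`, `modularity`, `greedy_partition`, `crosses_communities`) are hypothetical. Treating a message that crosses community boundaries as suspicious is likewise an illustrative assumption.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(mail_log):
    """Undirected weighted graph: weight = number of messages between two users."""
    weights = defaultdict(int)
    for sender, recipient in mail_log:
        if sender != recipient:  # ignore self-addressed mail
            weights[frozenset((sender, recipient))] += 1
    return dict(weights)


def modularity(weights, communities):
    """Modularity Q of a partition given as a list of sets of users."""
    total = sum(weights.values())  # total edge weight m
    label = {user: i for i, group in enumerate(communities) for user in group}
    inside = defaultdict(int)  # edge weight inside each community
    degree = defaultdict(int)  # summed weighted degree of each community
    for pair, w in weights.items():
        u, v = tuple(pair)
        degree[label[u]] += w
        degree[label[v]] += w
        if label[u] == label[v]:
            inside[label[u]] += w
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)


def greedy_partition(weights):
    """Start from singletons; repeatedly apply the merge that raises Q most.

    Naive on purpose: Q is recomputed from scratch for every candidate merge,
    and merging stops as soon as no merge improves Q.
    """
    users = {u for pair in weights for u in pair}
    communities = [{u} for u in sorted(users)]
    best_q = modularity(weights, communities)
    while len(communities) > 1:
        candidates = []
        for i, j in combinations(range(len(communities)), 2):
            merged = [c for k, c in enumerate(communities) if k not in (i, j)]
            merged.append(communities[i] | communities[j])
            candidates.append((modularity(weights, merged), merged))
        q, merged = max(candidates, key=lambda item: item[0])
        if q <= best_q:
            break
        best_q, communities = q, merged
    return communities, best_q


def crosses_communities(sender, recipient, communities):
    """True if the two users are not in the same community (assumed heuristic)."""
    return not any(sender in c and recipient in c for c in communities)


# Two tightly connected groups of users joined by a single message.
mail_log = [("a", "b"), ("b", "c"), ("a", "c"),
            ("x", "y"), ("y", "z"), ("x", "z"),
            ("c", "x")]
communities, q = greedy_partition(build_graph(mail_log))
```

On this toy log the two triangles end up in separate communities, so a message from `a` to `z` would be passed on for further checks while one from `a` to `b` would not. A production system would use incremental modularity updates rather than recomputing Q for every candidate merge.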
For this reason, we introduce the fast algorithm for detecting community structure in networks (referred to as the FN algorithm), the CPM (Clique Percolation Method) algorithm, and an algorithm based on node contacts, and establish a mechanism that compares their community partitions to obtain the optimal division.

To improve the accuracy of the decision mechanism, suspicious messages detected by the user-relationship community partition are examined further. We introduce the Classification and Regression Trees (CART) algorithm and, using the behavioral feature data previously extracted from mail headers, carry out a multiple discriminant comparison against the behavior of regular mail.

The system implementation and the results on the test dataset show that the detection mechanism based on user behavior relationships is reasonable and effective: it reduces the detection load on the mail server while achieving high accuracy and good recall.
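The core step of a CART classifier is choosing the feature and threshold that best separate the classes. The sketch below illustrates that step under stated assumptions. It uses Gini impurity, the usual CART criterion for classification, although the abstract does not say which criterion the thesis uses. The two header-derived features (recipients per message, and the fraction of recipients outside the sender's community) and all data values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = spam, 0 = regular)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values
    of each feature; the first split with the lowest impurity wins.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({row[f] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best


# Hypothetical header features per suspicious message:
# (recipients per message, fraction of recipients outside sender's community)
rows = [(1, 0.0), (2, 0.1), (1, 0.2), (40, 0.9), (55, 1.0), (30, 0.8)]
labels = [0, 0, 0, 1, 1, 1]
split = best_split(rows, labels)
```

A full tree applies `best_split` recursively to the left and right subsets until a stopping rule is met. Only the single split is shown here to keep the example short.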
Keywords/Search Tags: spam, user relationship behavior, community partitioning, algorithm based on node contacts, CART