
Research On Filtering Technology Of Spam Communication Behavior Detection Based On Decision Tree Algorithm

Posted on: 2009-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: H B Wang
GTID: 2178360245486311
Subject: Computer software and theory
Abstract/Summary:
With the advancement of science and technology and the development of computer networking, the Internet age has arrived. Its arrival has completely changed people's way of life, and more and more people are going online and enjoying the conveniences the network brings. However, as the Internet expands rapidly, a number of problems have become core issues that can no longer be ignored, and the flood of spam is among the most conspicuous.

Building on traditional e-mail filtering methods, this thesis focuses on improving filtering speed and saving network resources. Current filtering techniques that inspect e-mail content scan slowly and consume a large amount of network bandwidth. To address this, the thesis introduces the concept of communication behavior detection and, by combining it with data mining, proposes a new spam filtering method based on communication behavior detection and the Decision Tree algorithm. The method applies decision-tree-based data mining to the e-mail filtering system and improves the C4.5 algorithm so that it is better suited to the fast processing of large volumes of log data. Before data processing, the mail server log is discretized according to the characteristics of the integrated e-mail log data, which reduces the influence of continuous-valued attributes, and a decision tree is then constructed based on information entropy theory. Finally, pruning is applied to the decision tree to overcome its slow processing speed and excessive number of branches.

The filter operates at the session layer, so spam can be blocked before the mail data is transmitted, which saves network bandwidth. The system also achieves good precision and recall. The results indicate that the method is effective and greatly improves the mail server's spam filtering capability. The communication behavior detection technique proposed in this thesis provides a new and effective solution to spam and has good application prospects.
Keywords/Search Tags:Spam, Data mining, Communication behavior detection, Decision tree, Information entropy
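
The abstract does not spell out the improved C4.5 variant, so the following is only a minimal sketch of the standard building blocks it names: discretizing a continuous log attribute and scoring the result with an entropy-based gain ratio. The attribute (recipients per SMTP session), the cut point, and the sample records are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    counts = Counter(items)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def discretize(value, cut_points):
    """Map a continuous value to a bin index, given sorted cut points."""
    return sum(value >= cp for cp in cut_points)

def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one discrete attribute (not the thesis's variant)."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalizes attributes with many values
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical log records: recipients per session and a known label.
recipients = [1, 2, 1, 40, 55, 3, 60, 2]
labels = ["ham", "ham", "ham", "spam", "spam", "ham", "spam", "ham"]

# Assumed cut point: 10 or more recipients falls into bin 1.
bins = [discretize(r, [10]) for r in recipients]
score = gain_ratio(bins, labels)
```

A tree builder would compute this score for every candidate attribute at a node and split on the highest one. Because the attributes come from the session log rather than the message body, such a classifier could in principle be evaluated before any mail data is transferred, which is the saving the abstract describes.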