TGang: Design and Implementation of a Combined Two-Layer Large-Scale Spam Filter

Posted on: 2009-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: G Z Lu
Full Text: PDF
GTID: 2178360242483055
Subject: Computer software and theory
Abstract/Summary:
As an efficient and inexpensive means of modern communication, email has become one of the most popular Internet applications. However, the growing volume of spam has reduced the productivity of individuals and organizations. Traditional techniques such as blacklists, keyword matching, and email route checking cannot solve the problem, so many artificial intelligence and machine learning methods have been introduced to this area.

Among the machine learning methods, statistical methods such as Naive Bayes, Support Vector Machines, and Logistic Regression have been widely used and have achieved very good results. In the TREC 2006 Spam Track, bit-entropy-based methods such as Dynamic Markov Compression and Prediction by Partial Matching also showed strong results.

Although the most popular spam filters perform well, every filter still misclassifies some messages. In spam filtering in particular, the cost of misclassifying ham (legitimate mail) is much higher than the cost of misclassifying spam, so the False Positive (FP) rate, that is, the ham misclassification rate, of a spam filter is very important.

Most popular anti-spam products today combine filters in series to reduce the overall misclassification rate. This series structure has a drawback: once an earlier filter misclassifies a message, the later filters cannot change the verdict, because they can no longer retrieve the misclassified message.

To address these two problems, this thesis presents a two-layer combined spam filter. Compared with a single filter, it reduces the FP rate; compared with the anti-spam products, it makes use of the verdicts of all the spam filters.
Keywords/Search Tags: Anti-Spam, Combining Filter, False Positive Rate, Two Layers
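
The difference between the series structure and a two-layer structure can be illustrated with a small Python sketch. The abstract does not say how the second layer combines the first-layer verdicts, so the weighted average below is only an assumption for illustration, and the toy filters, scores, weights, and threshold are all hypothetical rather than taken from the thesis.

```python
from typing import Callable, Sequence

# Assumed interface: a filter maps a message to a spam score in [0, 1].
Filter = Callable[[str], float]


def keyword_filter(message: str) -> float:
    """Toy filter: suspicious if the word 'free' appears."""
    return 0.9 if "free" in message.lower() else 0.1


def link_filter(message: str) -> float:
    """Toy filter: suspicious if the message contains a plain-HTTP link."""
    return 0.8 if "http://" in message.lower() else 0.1


def shouting_filter(message: str) -> float:
    """Toy filter: suspicious if the whole message is upper case."""
    return 0.8 if message.isupper() else 0.2


FILTERS: Sequence[Filter] = (keyword_filter, link_filter, shouting_filter)


def series_classify(message: str, filters: Sequence[Filter],
                    threshold: float = 0.5) -> bool:
    """Series structure: the first filter that flags spam ends the pipeline,
    so later filters never see the message and cannot correct the verdict."""
    for f in filters:
        if f(message) >= threshold:
            return True
    return False


def two_layer_classify(message: str, filters: Sequence[Filter],
                       weights: Sequence[float],
                       threshold: float = 0.5) -> bool:
    """Two-layer structure (assumed form): layer 1 runs every filter,
    layer 2 decides from a weighted average of all the scores."""
    scores = [f(message) for f in filters]
    combined = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return combined >= threshold


ham = "Are you free for lunch tomorrow?"
spam = "FREE PILLS AT http://example.test NOW"

print(series_classify(ham, FILTERS))                # True: a false positive
print(two_layer_classify(ham, FILTERS, [1, 1, 1]))  # False: ham is kept
print(two_layer_classify(spam, FILTERS, [1, 1, 1])) # True: spam is caught
```

In this toy case the keyword filter alone condemns the ham message in the series structure, while the two-layer structure lets the other two filters outvote it. A real second layer could be any combiner trained on the first-layer outputs.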