
A Spam Filtering Model Based on Support Vector Machine Theory

Posted on: 2007-07-25  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang
GTID: 2208360185956692  Subject: Computer system architecture
Abstract/Summary:
With the spread of the Internet, electronic mail, thanks to its speed and convenience, has become one of the most important means of communication in people's work and everyday life. However, the accompanying spam problem is also becoming increasingly serious. Spam not only spreads illegitimate information but also consumes a large share of public Internet resources and infringes on the legitimate rights of email users and enterprises. Many spam filtering methods exist, yet the spam problem has not been brought under control and has instead grown. This shows that, despite the number of available methods, some related issues still lack appropriate solutions and filtering performance falls short of the ideal. It therefore remains worthwhile to research a more efficient and faster spam filtering system.

Support Vector Machine (SVM) is a newly developed pattern recognition method based on statistical learning theory. It offers particular advantages for pattern recognition problems with small samples, non-linearity, and high dimensionality: it seeks the best result attainable from a limited number of examples while taking generalization ability into account. Among SVM training algorithms, SMO (sequential minimal optimization) is a relatively simple method. It reduces the working set to two examples, so that each subproblem can be solved analytically and the complicated numerical optimization step is avoided. On the other hand, the cost is an increase in the number of iterations. This paper proposes a spam filtering model based on SVM theory that uses an improved SMO algorithm, called the SMO active learning algorithm. The model is for research only and is tested experimentally to observe its applicability and effectiveness. The preliminary experiments show that the model performs well and that its training time is short.

This paper first introduces background on spam, including its definition and the harm it causes, and then reviews existing spam filtering methods. The SVM-based spam filtering method belongs mainly to the field of content filtering, so the text...
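
The two-example working set that the abstract attributes to SMO can be illustrated with a minimal sketch. It is a simplified SMO variant (the second index is picked at random rather than by Platt's heuristics) and is not the thesis's improved "SMO active learning algorithm", whose details the abstract does not give. The linear kernel, the function names `train_simplified_smo` and `predict`, the parameter defaults, and the toy two-feature data are all assumptions made for illustration.

```python
import random


def linear_kernel(a, b):
    """Dot product of two feature vectors (a linear kernel is assumed here)."""
    return sum(x * y for x, y in zip(a, b))


def train_simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=10, max_iters=1000, seed=0):
    """Simplified SMO: optimise two multipliers at a time, analytically.

    Hypothetical helper for illustration. Labels in y must be +1 or -1.
    Returns the multipliers alpha and the bias b.
    """
    rng = random.Random(seed)
    n = len(X)
    alpha = [0.0] * n
    b = 0.0
    K = [[linear_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]

    def f(i):
        # Decision value of the current model on training example i.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b

    passes = 0
    for _ in range(max_iters):  # hard cap so the sketch always terminates
        if passes >= max_passes:
            break
        changed = 0
        for i in range(n):
            E_i = f(i) - y[i]
            # Only work on examples that violate the KKT conditions.
            if not ((y[i] * E_i < -tol and alpha[i] < C) or (y[i] * E_i > tol and alpha[i] > 0)):
                continue
            j = rng.choice([k for k in range(n) if k != i])  # random partner
            E_j = f(j) - y[j]
            a_i_old, a_j_old = alpha[i], alpha[j]
            # Box bounds that keep both multipliers in [0, C] on the constraint line.
            if y[i] != y[j]:
                L, H = max(0.0, a_j_old - a_i_old), min(C, C + a_j_old - a_i_old)
            else:
                L, H = max(0.0, a_i_old + a_j_old - C), min(C, a_i_old + a_j_old)
            if L == H:
                continue
            eta = 2 * K[i][j] - K[i][i] - K[j][j]
            if eta >= 0:
                continue
            # Closed-form solution of the two-variable subproblem, then clipping.
            a_j = min(H, max(L, a_j_old - y[j] * (E_i - E_j) / eta))
            if abs(a_j - a_j_old) < 1e-5:
                continue
            a_i = a_i_old + y[i] * y[j] * (a_j_old - a_j)
            alpha[i], alpha[j] = a_i, a_j
            b1 = b - E_i - y[i] * (a_i - a_i_old) * K[i][i] - y[j] * (a_j - a_j_old) * K[i][j]
            b2 = b - E_j - y[i] * (a_i - a_i_old) * K[i][j] - y[j] * (a_j - a_j_old) * K[j][j]
            if 0 < a_i < C:
                b = b1
            elif 0 < a_j < C:
                b = b2
            else:
                b = (b1 + b2) / 2
            changed += 1
        # Count consecutive passes in which no pair of multipliers changed.
        passes = passes + 1 if changed == 0 else 0
    return alpha, b


def predict(X, y, alpha, b, x):
    """Classify x as +1 (e.g. spam) or -1 (e.g. legitimate) with the trained model."""
    score = sum(a * t * linear_kernel(xi, x) for a, t, xi in zip(alpha, y, X)) + b
    return 1 if score >= 0 else -1


# Toy, made-up feature vectors standing in for extracted email features.
X_toy = [[2.0, 2.0], [3.0, 3.0], [2.0, 3.0], [-2.0, -2.0], [-3.0, -3.0], [-2.0, -3.0]]
y_toy = [1, 1, 1, -1, -1, -1]
alpha_toy, b_toy = train_simplified_smo(X_toy, y_toy)
print([predict(X_toy, y_toy, alpha_toy, b_toy, x) for x in X_toy])
```

Each inner step touches only two multipliers and needs no numerical solver, which is the trade-off the abstract describes: cheap individual steps in exchange for more iterations.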
Keywords/Search Tags: Spam filtering, Support Vector Machine, SMO (Sequential minimal optimization) theory, SMO active learning algorithm, Feedback learning technology