Spam Filtering System Based on the Bayesian Algorithm

Posted on: 2011-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhao
Full Text: PDF
GTID: 2208360308966656
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, e-mail has become an indispensable means of communication. It is valued for its convenience and is likely to become the dominant communication channel. As e-mail has grown more popular, however, malicious senders have begun to flood it with spam containing advertisements or illegal content. Spam not only wastes network bandwidth and users' time but also disrupts their work, life, and study, so finding a feasible and effective anti-spam method has become urgent.

To meet the needs of the national "242" security project, this thesis surveys the main anti-spam techniques used in China and abroad and designs an anti-spam system. The system adopts content-based filtering, which filters spam effectively. Among content-based techniques, the Bayesian algorithm classifies clearly better than the others, so the thesis designs a spam filter based on it. In addition, a new measure for improving the traditional Bayesian filter is proposed.

The main work is as follows:
(1) The latest spam filtering technologies are studied and their advantages and disadvantages compared, in order to select the most suitable one and implement it.
(2) The main pre-processing technologies (mail decoding, Chinese word segmentation, and feature word extraction) are studied and analysed, and those that fit the system are selected.
(3) The mail pre-processing pipeline is implemented, including mail decoding, Chinese word segmentation, and feature word extraction.
(4) A spam filter is implemented, including the training and classification processes, and the best values of several parameters are determined to improve the system's precision.
(5) The advantages and disadvantages of the Bayesian filter are analysed, and several problems encountered in implementing the incremental Bayesian filter are resolved.
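
The training and classification processes of a Bayesian filter, and the idea of updating it incrementally, can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a multinomial naive Bayes model with Laplace smoothing and mail that has already been decoded and segmented into word tokens. The class name `IncrementalNaiveBayes`, its methods, and the `smoothing` and `threshold` parameters are hypothetical names chosen for illustration.

```python
import math
from collections import Counter


class IncrementalNaiveBayes:
    """Illustrative multinomial naive Bayes over word tokens (assumed model)."""

    LABELS = ("spam", "ham")

    def __init__(self, smoothing=1.0):
        # Laplace smoothing constant; must be positive to avoid log(0).
        self.smoothing = smoothing
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    def train(self, tokens, label):
        # Incremental update: a new labelled mail only adjusts the counts,
        # so earlier mail never has to be reprocessed.
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1

    def spam_log_odds(self, tokens):
        # log P(spam | tokens) - log P(ham | tokens) under the naive
        # (conditional independence) assumption.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            prior = (self.doc_counts[label] + self.smoothing) / (
                total_docs + len(self.LABELS) * self.smoothing
            )
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            # One extra vocabulary slot is reserved for unseen tokens.
            denom = total_words + self.smoothing * (len(vocab) + 1)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + self.smoothing) / denom)
            scores[label] = score
        return scores["spam"] - scores["ham"]

    def is_spam(self, tokens, threshold=0.0):
        # A higher threshold trades missed spam for fewer false positives.
        return self.spam_log_odds(tokens) > threshold


nb = IncrementalNaiveBayes()
nb.train(["free", "prize", "click"], "spam")
nb.train(["meeting", "tomorrow", "agenda"], "ham")
print(nb.is_spam(["free", "prize"]))
```

In this sketch, `threshold` and `smoothing` stand in for the kind of parameters a filter would tune to improve precision. The thesis's actual parameters are not stated in the abstract.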
Keywords/Search Tags: Bayesian algorithm, spam filter, pre-processing, word segmentation, feature word extraction