Development And Research Of Spam Filter System Based On Bayesian Algorithm

Posted on: 2011-08-15 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Zhao
GTID: 2178360305970632 | Subject: Computer software and theory
Abstract/Summary:
With the development and wide application of the Internet, e-mail has become one of the most popular forms of information exchange because it is convenient, fast, and low-cost. At the same time, spam has caused enormous harm and great losses, so researching and designing an efficient spam filter system has great practical significance. Content-based filtering with the Bayesian algorithm has shown a high degree of accuracy in filtering spam and has therefore received extensive attention.

In this thesis, after studying the principles of e-mail and existing spam filtering methods, a spam filtering solution is designed and a spam filtering system is developed. The distinguishing feature of the solution is that it combines keyword filtering with a probability-based statistical Bayesian content filtering algorithm. The system can perform feedback learning according to the characteristics of spam and the needs of different users. The introduction of a Chinese word segmentation mechanism gives the system a good ability to filter both English and Chinese messages.

The system is developed on the Eclipse platform and tested with the Ling-Spam corpus. The test results show that the system has a good filtering effect and a good ability to learn on its own, and that it can be personalized to some extent.
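
The combination described above (a keyword filter in front of a Bayesian content filter, with feedback learning) can be illustrated with a minimal sketch. This is not the thesis's implementation, which is not shown in the abstract. The class name `HybridSpamFilter`, the 0.9 threshold, the Laplace smoothing, and the simple English-only tokenizer are all assumptions made for illustration. Chinese text would additionally need a word segmentation step before tokens are counted, which is not shown here.

```python
import math
import re
from collections import Counter


class HybridSpamFilter:
    """Hypothetical sketch: keyword blocklist plus multinomial naive Bayes."""

    def __init__(self, blocked_keywords=(), threshold=0.9):
        # The blocklist and the threshold value are illustrative assumptions.
        self.blocked = {k.lower() for k in blocked_keywords}
        self.threshold = threshold
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # English-only tokenizer; Chinese would need a segmenter instead.
        return re.findall(r"[a-z0-9']+", text.lower())

    def learn(self, text, label):
        """Feedback learning: add one user-labelled message to the counts."""
        self.word_counts[label].update(self.tokenize(text))
        self.msg_counts[label] += 1

    def spam_probability(self, text):
        """Return P(spam | text) under naive Bayes with Laplace smoothing."""
        total = sum(self.msg_counts.values())
        if total == 0:
            return 0.5  # nothing learned yet, so no evidence either way
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        tokens = self.tokenize(text)
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed class prior.
            log_p = math.log((self.msg_counts[label] + 1) / (total + 2))
            n = sum(self.word_counts[label].values())
            for tok in tokens:
                # The extra +1 in the denominator reserves mass for unseen words.
                count = self.word_counts[label][tok]
                log_p += math.log((count + 1) / (n + len(vocab) + 1))
            log_score[label] = log_p
        # Convert the log-odds to a probability, clamped to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))

    def is_spam(self, text):
        # Stage 1: keyword filter. Stage 2: Bayesian content filter.
        if any(tok in self.blocked for tok in self.tokenize(text)):
            return True
        return self.spam_probability(text) >= self.threshold


f = HybridSpamFilter(blocked_keywords={"viagra"})
f.learn("cheap pills buy now", "spam")
f.learn("win money now", "spam")
f.learn("meeting agenda for monday", "ham")
f.learn("lunch on monday", "ham")
print(f.is_spam("viagra discount"))           # blocked by the keyword stage
print(f.spam_probability("buy cheap pills"))  # scored by the Bayesian stage
```

In this sketch, per-user adaptation comes only from calling `learn` with messages the user has marked as spam or not spam. Each user's filter instance therefore drifts toward that user's own mail.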
Keywords/Search Tags: Spam, Bayesian algorithm, Word Segmentation, Filtering, Feedback learning