
The Design And Implementation Of Spam Email Filtering Technology Based On Content Analysis

Posted on: 2014-10-20  Degree: Master  Type: Thesis
Country: China  Candidate: P Fang  Full Text: PDF
GTID: 2268330425967861  Subject: Software engineering
Abstract/Summary:
E-mail has brought a revolutionary change to human communication. It is a fast, asynchronous technology for transmitting information, and messages can be received at any time and in any place. However, while e-mail brings convenience, it has also been heavily abused. Today, spam harms the normal development of the Internet and can cause great damage. How to filter spam accurately has therefore become a hot research topic in recent years. Among anti-spam techniques, Bayesian text classification is the most widely used and usually shows the best filtering results. It takes up few system resources and saves computing time, and it performs especially well on mail written in Latin-script languages. On Chinese message sets, however, the results are still not ideal. Word segmentation is the process of recombining a continuous character string into a sequence of words according to a given specification. Because of language differences, the segmentation approach that works for Latin-script text is completely unworkable for Chinese, so a good Chinese word segmentation method has to be developed from an analysis of Chinese word segmentation and then applied to the anti-spam system. This thesis therefore studies and implements the application of Chinese word segmentation algorithms and Bayesian filtering to spam: it uses the word segmenter provided by the Chinese Academy of Sciences together with a Bayesian algorithm to write a spam filter, and designs an e-mail filtering system based on content analysis. Message text is first classified, and a score is then calculated with the Bayesian algorithm; when the score exceeds a certain threshold, the message is judged to be spam, and otherwise it is treated as regular mail. For the mail-receiving part, given current market conditions among service providers, this thesis uses a simple receiving model in order to simulate a realistic environment as closely as possible.
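The filtering step described in the abstract (tokenize the message, compute a Bayesian score, compare it with a threshold) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names, the Laplace smoothing, and the threshold value of 0.9 are all assumptions made for illustration. The `tokenize` parameter defaults to whitespace splitting and only stands in for a real Chinese word segmenter such as the one the thesis uses. The sketch also assumes the training set contains at least one spam and one regular message.

```python
import math
from collections import Counter

def train(messages, tokenize=str.split):
    """Build a model from (text, is_spam) pairs. `tokenize` is a placeholder
    for a real word segmenter (assumption: whitespace split by default)."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(tokenize(text))
        docs[is_spam] += 1
    vocab = set(counts[True]) | set(counts[False])
    return {"counts": counts, "docs": docs, "vocab": vocab}

def spam_probability(model, text, tokenize=str.split):
    """Posterior P(spam | text) under naive Bayes with Laplace smoothing
    (the smoothing choice is an assumption, not taken from the thesis)."""
    counts, docs, vocab = model["counts"], model["docs"], model["vocab"]
    total_docs = docs[True] + docs[False]
    tokens = tokenize(text)
    log_score = {}
    for label in (True, False):
        total_tokens = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)  # class prior
        for token in tokens:
            # Add-one smoothing keeps unseen tokens from zeroing the product.
            score += math.log((counts[label][token] + 1)
                              / (total_tokens + len(vocab)))
        log_score[label] = score
    diff = log_score[False] - log_score[True]
    if diff > 700:  # avoid overflow in exp; posterior is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(diff))

def is_spam(model, text, threshold=0.9, tokenize=str.split):
    """Flag a message as spam when its posterior exceeds the threshold
    (0.9 is an illustrative value only)."""
    return spam_probability(model, text, tokenize) > threshold

training = [
    ("win free prize now", True),
    ("free money offer", True),
    ("meeting schedule for monday", False),
    ("project report attached", False),
]
model = train(training)
```

Calling `is_spam(model, text)` on a new message returns `True` or `False`. For Chinese mail, a segmenter function would be passed as `tokenize` in both `train` and `is_spam`, since whitespace splitting does not separate Chinese words.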
Keywords/Search Tags: spam e-mail filtering, content analysis, Chinese word segmentation, Naive Bayes