
Research On Emails Filtering Algorithm Based On Parallel Multiple Bayes Fusion Model

Posted on: 2009-08-11; Degree: Master; Type: Thesis
Country: China; Candidate: Q Huang; Full Text: PDF
GTID: 2178360272992209; Subject: Computer application technology
Abstract/Summary:
With the spread of the Internet, email has become part of everyday life. The growing volume of spam, however, annoys users and wastes computer storage, computing time and network bandwidth. Dealing with spam also costs users considerable time, so protecting them from it has become a pressing problem. This thesis studies and implements a filtering algorithm for spam email, and is organized as follows.

First, the principles, architecture, protocols and message formats of the email system are introduced; they form the basis for analyzing the message formats to be filtered and for implementing our system. We then analyze current text classification algorithms and the criteria used to evaluate their performance, and adopt recall and accuracy as the performance measures for our spam filtering algorithm.

Second, starting from the principle of the Bayes algorithm, we analyze its main idea and the main classification models based on it. We then propose an email filtering algorithm that fuses the plain Bayes algorithm, PG Bayes algorithm I and PG Bayes algorithm II, combined with a neural network. Experimental results indicate that recall and accuracy improve noticeably, although efficiency may decrease.

Finally, building on the previously proposed email filtering algorithm based on parallel multiple Bayes fusion, we introduce a strategy for adjusting feature-item weights, in which the weight of each text feature is adjusted using a text evidence weight function. On this basis we design a prototype email filtering system and describe in detail its main modules: mail preprocessing, text segmentation and feature extraction. Test results for the prototype system show that the proposed algorithm identifies spam effectively and accurately.
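
The abstract does not give the fusion rule or the exact metric definitions, so the following Python sketch is only an illustration under stated assumptions. It assumes each Bayes classifier outputs a spam probability per message, and it uses a fixed weighted average as a simple stand-in for the neural network combiner. Accuracy is taken here to mean the fraction of all messages classified correctly, which may differ from the thesis's definition. The names `fuse_scores`, `classify` and `recall_and_accuracy`, the weights and the threshold are all hypothetical.

```python
def fuse_scores(scores, weights):
    """Weighted average of per-classifier spam probabilities.

    Assumption: a fixed weighted average replaces the neural network
    that the thesis uses to combine the Bayes classifiers.
    """
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def classify(per_message_scores, weights, threshold=0.5):
    """Label a message as spam (True) when its fused score reaches the threshold."""
    return [fuse_scores(s, weights) >= threshold for s in per_message_scores]


def recall_and_accuracy(predicted, actual):
    """Recall: share of true spam that was caught.
    Accuracy (assumed definition): share of all messages labelled correctly.
    """
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    accuracy = correct / len(actual) if actual else 0.0
    return recall, accuracy


# Made-up scores from three hypothetical classifiers for four messages.
per_message_scores = [
    [0.9, 0.8, 0.7],
    [0.2, 0.6, 0.1],
    [0.4, 0.3, 0.45],
    [0.1, 0.2, 0.05],
]
actual = [True, False, True, False]   # True means spam
weights = [1, 1, 2]                   # illustrative weights only

predicted = classify(per_message_scores, weights)
recall, accuracy = recall_and_accuracy(predicted, actual)
print(predicted, recall, accuracy)
```

A learned combiner would replace the fixed `weights` with parameters fitted on labelled mail; the evaluation step would stay the same.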
Keywords/Search Tags: Email Filtering, Text Classification, Bayes Algorithm