Mail Message Filtering Algorithm

Posted on: 2003-11-30, Degree: Master, Type: Thesis
Country: China, Candidate: X J Shi, Full Text: PDF
GTID: 2208360092470225, Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, online electronic information is booming, and electronic mail has become the fastest and most economical form of communication available. Unfortunately, junk mail has become widespread at the same time. Junk mail not only fills up mail server storage space but also forces users to spend considerable time removing it. As a result, it is important to develop an automated mail filter.

There are two major approaches to automated mail filtering: rule-based and probability-based. Compared with other text classifiers, the Naive Bayes algorithm has been more widely used in text classification because this simple method classifies texts correctly and quickly. Mistaking legitimate mail for junk causes greater loss than mistaking junk for legitimate mail. However, the conventional Naive Bayes method does not consider the differing features of legitimate mail and junk mail during classification and filtering, and it does not take into account the loss of misclassifying legitimate mail as junk, so it has limitations when applied to mail filtering.

The paper introduces the principles, merits, and limitations of the prevalent text classifiers, and analyzes the limitations of the Naive Bayes algorithm when it is used for filtering mail. Building on the Naive Bayes algorithm, a new mail filter based on the risk minimization Bayes algorithm is proposed in this paper, which can not only classify legitimate mail correctly but also improve the accuracy of classification. The experimental results show that the filter performs better. Based on the algorithm described above, a model of a mail filter is completed in this paper.
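
The abstract does not give the details of the proposed filter, so the following is only a minimal sketch of the general idea of a minimum-risk decision on top of Naive Bayes, not the thesis's actual algorithm. It assumes a multinomial Naive Bayes model with Laplace smoothing and a single hypothetical parameter, cost_ratio, expressing how many times worse it is to mark legitimate mail as junk than to let junk through. The function names, the toy training data, and the default cost ratio are all illustrative assumptions.

```python
import math
from collections import Counter

# Toy training data (hypothetical): (tokens, label) pairs.
TRAINING = [
    (["win", "money", "now"], "spam"),
    (["cheap", "money", "offer"], "spam"),
    (["meeting", "schedule", "now"], "ham"),
    (["project", "report", "offer"], "ham"),
]


def train(messages):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for tokens, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def log_posterior_odds(tokens, model):
    """Return log(P(spam | tokens) / P(ham | tokens)) under Naive Bayes."""
    word_counts, doc_counts, vocab = model
    totals = {c: sum(word_counts[c].values()) for c in ("spam", "ham")}
    v = len(vocab)
    # Start from the prior odds of the two classes.
    log_odds = math.log(doc_counts["spam"] / doc_counts["ham"])
    for t in tokens:
        if t not in vocab:
            continue  # ignore words never seen in training
        # Laplace (add-one) smoothed word likelihoods.
        p_spam = (word_counts["spam"][t] + 1) / (totals["spam"] + v)
        p_ham = (word_counts["ham"][t] + 1) / (totals["ham"] + v)
        log_odds += math.log(p_spam / p_ham)
    return log_odds


def classify(tokens, model, cost_ratio=9.0):
    """Minimum-risk decision.

    Marking legitimate mail as junk costs cost_ratio; missing junk costs 1.
    The expected loss of answering "spam" is cost_ratio * P(ham | x) and of
    answering "ham" is P(spam | x), so "spam" is chosen only when the
    posterior odds exceed cost_ratio.
    """
    if log_posterior_odds(tokens, model) > math.log(cost_ratio):
        return "spam"
    return "ham"


model = train(TRAINING)
print(classify(["money"], model, cost_ratio=1.0))
print(classify(["money"], model, cost_ratio=9.0))
print(classify(["money", "win", "cheap"], model, cost_ratio=9.0))
```

With cost_ratio set to 1 the rule reduces to the ordinary Naive Bayes decision. Raising it makes the filter more reluctant to discard a message, which reflects the abstract's point that losing legitimate mail is the more costly error.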
Keywords/Search Tags: Feature Extraction, Risk Minimization Bayes, Mail Filter