
Research On The Key Technology Of Tibetan-based Bayesian Spam Filtering System

Posted on: 2014-11-03; Degree: Master; Type: Thesis
Country: China; Candidate: Q (Tse Qu) Ci; Full Text: PDF
GTID: 2308330473956678; Subject: Software engineering
Abstract/Summary:
With heavy national investment in information infrastructure in Tibetan areas, the development of international standards for the Tibetan coded character set, and the release of Windows Vista, an operating system with comprehensive Tibetan support, the use of Tibetan on computers and the Internet is becoming increasingly smooth. More and more Tibetan people use their native language online to transfer and exchange information. At the same time, Tibetan spam is beginning to emerge. Bayesian algorithms are the most popular approach to spam filtering because they are easy to design and classify well. This thesis first analyzes the shortcomings of existing Chinese spam filtering systems, the dangers of spam, and the principles of the e-mail system; it then analyzes and compares a variety of e-mail filtering algorithms, focusing on filtering based on Naive Bayes theory. The thesis then studies the technologies related to Tibetan spam filtering based on the Naive Bayes algorithm, concentrating on two key technologies: automatic identification and conversion of Tibetan encodings, and automatic Tibetan word segmentation based on the HMM algorithm.
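As an illustration of the Naive Bayes filtering idea named above, the following is a minimal sketch and not the thesis's implementation. It assumes each message has already been converted to a single encoding and segmented into tokens (the two preprocessing steps the abstract highlights). The class name `NaiveBayesSpamFilter`, the labels "spam" and "ham", the Laplace smoothing constant `alpha`, and the placeholder English tokens are all assumptions made for the example.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes over pre-segmented tokens (hypothetical sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocab = set()

    def train(self, tokens, label):
        # Accumulate per-class token counts and the shared vocabulary.
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, tokens, label):
        # log P(label) + sum over tokens of log P(token | label).
        # Assumes at least one training message exists for each label.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for token in tokens:
            score += math.log((self.word_counts[label][token] + self.alpha) / denom)
        return score

    def classify(self, tokens):
        # Choose the label with the larger posterior log-score.
        return max(("spam", "ham"), key=lambda label: self.log_score(tokens, label))


# Placeholder tokens; in a Tibetan filter they would come from the segmenter.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train(["win", "prize", "now"], "spam")
spam_filter.train(["free", "prize"], "spam")
spam_filter.train(["meeting", "tomorrow"], "ham")
spam_filter.train(["lunch", "tomorrow", "now"], "ham")
print(spam_filter.classify(["free", "prize"]))
```

Working in log space avoids numeric underflow on long messages, and the smoothing term keeps unseen tokens from driving a class probability to zero.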
Keywords/Search Tags: Naive Bayes Algorithm, Spam Filtering, Tibetan, HMM