
Research On The Intention Analysis Method Based On Content Of Spam

Posted on: 2012-04-03; Degree: Master; Type: Thesis
Country: China; Candidate: J T Sun
GTID: 2178330332499266; Subject: Computer software and theory
Abstract/Summary:
With the development of the information society, demand for information exchange keeps growing. E-mail improves the efficiency of information transmission and makes communication between people more convenient. However, as the volume of e-mail grows, so does the amount of spam. Many researchers have studied spam filtering and proposed a variety of filtering methods, and the widespread use of these methods has brought the spread of spam under temporary control. As filtering technology has developed, however, spammers have adopted more inventive means of evading spam filtering systems, so traditional spam recognition methods are limited in several respects.

First, this thesis surveys the traditional techniques and observes that spammers usually have a specific purpose when they send spam to a recipient. Second, we extract a new intent feature of spam and propose a new content-based intention analysis method that combines intention analysis with traditional content-based identification. A comparison of the experimental results indicates that the intention analysis method based on the content of spam achieves better recall and precision than traditional methods.

The main work is divided into the following points:

1. We survey and study the traditional techniques, then briefly describe new trends in spam identification technology.
2. We study and analyze in detail two important traditional content-based methods: Bayesian methods and Support Vector Machines. By analyzing the basic principles and existing improvements of these methods, we determine how to combine them with the intention analysis method so that the traditional methods serve as the pre-classification stage of the new method.
3. We propose an intention analysis method based on the content of spam and describe its principle and process. We also describe the two key algorithms of the new method: the Intention Analysis algorithm and the Intention Feedback Learning algorithm.
4. We complete the system design and implementation, then run experiments on three different data sets. A comparison of the experimental results indicates that the intention analysis method achieves better recall and precision than traditional methods, with average increases of 1% to 2% in recall and 0.5% in precision.

In summary, this thesis applies intention analysis to spam identification and combines it with traditional identification technology to make full use of the latter's advantages. We analyze and explain the algorithms and processes in detail and show through experiments the advantages of the content-based intention analysis method. Although the improvement in precision is modest, we believe that as the new method is gradually refined and developed, its indicators will improve further.
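
The abstract describes a two-stage design in which a traditional content-based classifier acts as pre-classification and intention analysis follows, evaluated by recall and precision. The following is only a minimal sketch of that general idea, not the thesis's Intention Analysis or Intention Feedback Learning algorithms, which the abstract does not specify. The cue list `INTENT_CUES`, the thresholds `low` and `high`, and all function and class names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical intent cues (assumption): the abstract does not list its intent features.
INTENT_CUES = {"buy", "click", "order", "subscribe", "win"}


def tokenize(text):
    """Lowercase whitespace tokenizer; a real system would do far more."""
    return text.lower().split()


class NaiveBayesPreClassifier:
    """Word-count naive Bayes with add-one smoothing (the pre-classification stage)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_probability(self, text):
        spam, ham = self.word_counts["spam"], self.word_counts["ham"]
        vocab = set(spam) | set(ham)
        n_spam, n_ham = sum(spam.values()), sum(ham.values())
        # Smoothed prior log-odds, then one log-likelihood ratio per known word.
        log_odds = math.log((self.doc_counts["spam"] + 1) / (self.doc_counts["ham"] + 1))
        for word in tokenize(text):
            if word not in vocab:
                continue  # unseen words carry no evidence in this sketch
            p_spam = (spam[word] + 1) / (n_spam + len(vocab))
            p_ham = (ham[word] + 1) / (n_ham + len(vocab))
            log_odds += math.log(p_spam / p_ham)
        # Numerically stable logistic function.
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1 + e)


def intent_score(text):
    """Count tokens that match the hypothetical intent cues."""
    return sum(token in INTENT_CUES for token in tokenize(text))


def classify(model, text, low=0.3, high=0.7):
    """Trust the pre-classifier when it is confident; otherwise consult intent."""
    p = model.spam_probability(text)
    if p >= high:
        return "spam"
    if p <= low:
        return "ham"
    return "spam" if intent_score(text) > 0 else "ham"


def precision_recall(predicted, actual):
    """Precision and recall for the 'spam' class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == "spam" and a == "spam" for p, a in pairs)
    fp = sum(p == "spam" and a == "ham" for p, a in pairs)
    fn = sum(p == "ham" and a == "spam" for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In this sketch, intent is consulted only for messages the pre-classifier is unsure about, which is one plausible reading of "pre-classification"; the thesis may combine the two stages differently. To use the sketch, train the model with `model.train(text, "spam")` or `model.train(text, "ham")`, call `classify(model, text)` on new messages, and pass the predicted and true labels to `precision_recall`.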
Keywords/Search Tags: Mail Filtering, Intention Analysis, Support Vector Machine, Bayesian Method, Intention Feedback Learning