
Research On The Image Spam Filtering Technology Based On Content

Posted on: 2011-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: F Liu
Full Text: PDF
GTID: 2178360308955335
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Email is widely used and brings people great convenience, but its by-product, spam, disrupts people's lives and work. Large volumes of spam waste time and energy, and messages that carry viruses and Trojans pose a serious threat to network and computer system security. Since the 1990s, researchers in China and abroad have studied the problem extensively and proposed a number of text-based spam filtering solutions. To circumvent these filters, a growing share of spam embeds its content in images; such messages are called image spam (I-spam). According to Symantec's spam status report, image spam accounted for about 20% of all spam in May 2009, which makes image spam filtering an important problem.

Based on the differences between spam images and normal images, this thesis proposes a content-based image spam filtering method and designs a hierarchical combination filtering system to carry it out. The main contributions are as follows:

(1) Several types of image features are analyzed, and the gradient and gradient direction features are applied to image spam filtering. These two features reflect the gray-level changes of objects in an image. Experiments showed that they yield a higher recognition rate.

(2) A hierarchical combination image spam filtering system is built that uses both feature-level and filter-level combination. Features of a similar type are combined into a single feature vector (feature-level combination), while features of different types are combined at the filter level. Experimental results showed that the proposed method not only avoids overfitting but also makes full use of the advantages of the different filters, and so achieves a high recognition rate.

(3) The LS-SVM algorithm is applied to image spam filtering. LS-SVM is an improved variant of SVM that has been applied to regression, classification, and other fields, where it has been shown to perform better than the standard SVM. The experiments in this thesis showed that LS-SVM also produces better results than the other algorithms tested for image spam filtering.
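As an illustration of contribution (3), the following is a minimal sketch of an LS-SVM classifier in its standard textbook form, where training reduces to solving one linear system instead of a quadratic program. It is not the thesis's implementation: the RBF kernel, the values of `gamma` and `sigma`, the function names, and the 2-D toy points standing in for image feature vectors are all assumptions made for this example.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B (assumed kernel)."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM linear system
         [ 0   y^T          ] [ b     ]   [ 0 ]
         [ y   Omega + I/g  ] [ alpha ] = [ 1 ]
       with Omega_ij = y_i * y_j * K(x_i, x_j). Labels y must be +1 / -1."""
    n = X.shape[0]
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    """Decision rule: sign( sum_i alpha_i * y_i * K(x, x_i) + b )."""
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Toy data (hypothetical): two well-separated clusters standing in for
# feature vectors of normal images (-1) and spam images (+1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(3.0, 0.3, size=(20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

b, alpha = lssvm_train(X, y)
pred = lssvm_predict(X, y, b, alpha, X)
new_pred = lssvm_predict(X, y, b, alpha, np.array([[0.1, -0.1], [3.1, 2.9]]))
```

In a real filter, the rows of `X` would be feature vectors extracted from images (for example the gradient-based features of contribution (1)), and `gamma` and `sigma` would be tuned by cross-validation rather than fixed as they are here.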
Keywords/Search Tags: Image spam, feature extraction, LS-SVM