Research On Image Spam Identification Based On Bayesian Network

Posted on: 2011-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: D D Zheng
Full Text: PDF
GTID: 2178360302993894
Subject: Computer application technology
Abstract/Summary:
Spam is evolving from text to image formats and is seriously affecting people's daily lives. Text-based spam filtering methods cannot cope with this change, so the identification of image spam has become a practical research topic. A survey of current research shows that generic image recognition methods ultimately compare a spam image against a repository of sample images, which involves a very heavy workload. A method that identifies image spam with a reduced workload is therefore of value.

A Bayesian network is a probability-based method of reasoning under uncertainty. It has become an important tool for handling uncertain information in intelligent systems such as statistical decision-making and expert systems. The main tasks of Bayesian network modeling are determining the topology of the network and computing the conditional probability distributions of its nodes.

This thesis applies Bayesian networks to the identification of image spam. Starting from the general features of image information, it extracts the relevant properties of e-mail images and uses them to filter image spam. To reason about an unknown image, it combines the feature data of that image with the topology of the network model and the probability distributions of the corresponding nodes.

The main work of this thesis is as follows:

1) First, the sample library of e-mail images and some basic features of images are analyzed. Three features are selected for spam images: color features, noise characteristics, and texture characteristics. Each is discussed and analyzed in detail. Specific feature extraction methods are then given, and the relevant feature data of the images is extracted with MATLAB.

2) Bayesian principles and the basic theory of Bayesian networks are combined with the feature data of the e-mail images. Candidate Bayesian network models based on the image features are constructed by learning from the image data, and the optimal one is selected as the final model using a Bayesian network score function, namely the posterior probability. The parameters of all nodes are then learned to obtain the joint probability distribution.

3) Using the constructed Bayesian network model and the probability distribution of each node, reasoning about an unknown image is carried out with the elimination method, based on the feature data extracted from that image.

4) Finally, the constructed Bayesian network model is used to reason about a series of images. The reasoning results are analyzed to verify the correctness and usability of the model.

This thesis covers the whole process: data analysis of the e-mail image samples, analysis of the characteristics of spam images, feature selection, modeling, and image reasoning and recognition. It obtains a reasonable Bayesian network model of images. When used to identify unknown e-mail images, the model classified them accurately with a low false-positive rate.
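The reasoning step in item 3 can be illustrated with a minimal Python sketch. It assumes that "the elimination method" refers to standard variable elimination. The network structure (spam → color, spam → noise, noise → texture), the binary encoding of every feature, and all probability values are hypothetical and chosen only for illustration; they are not the thesis's learned model or its data.

```python
from itertools import product

# A factor is (variables, table): variables is a tuple of node names and
# table maps one tuple of values (same order) to a non-negative number.
DOMAIN = (0, 1)  # assumption: every node is binary in this toy network

def restrict(factor, var, value):
    """Fix var to its observed value and drop it from the factor."""
    variables, table = factor
    if var not in variables:
        return factor
    i = variables.index(var)
    kept = variables[:i] + variables[i + 1:]
    return kept, {k[:i] + k[i + 1:]: p for k, p in table.items() if k[i] == value}

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    (fv, ft), (gv, gt) = f, g
    variables = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for values in product(DOMAIN, repeat=len(variables)):
        row = dict(zip(variables, values))
        table[values] = ft[tuple(row[v] for v in fv)] * gt[tuple(row[v] for v in gv)]
    return variables, table

def sum_out(factor, var):
    """Marginalize var out of the factor."""
    variables, table = factor
    i = variables.index(var)
    summed = {}
    for k, p in table.items():
        key = k[:i] + k[i + 1:]
        summed[key] = summed.get(key, 0.0) + p
    return variables[:i] + variables[i + 1:], summed

def query(factors, target, evidence, elimination_order):
    """Return P(target | evidence) by variable elimination."""
    factors = list(factors)
    for var, value in evidence.items():
        factors = [restrict(f, var, value) for f in factors]
    for var in elimination_order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        combined = touching[0]
        for f in touching[1:]:
            combined = multiply(combined, f)
        factors = [f for f in factors if var not in f[0]] + [sum_out(combined, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    assert result[0] == (target,), "all non-target variables must be eliminated"
    total = sum(result[1].values())
    return {k[0]: p / total for k, p in result[1].items()}

# Hypothetical conditional probability tables (illustrative values only).
FACTORS = [
    (("spam",), {(0,): 0.6, (1,): 0.4}),                       # P(spam)
    (("color", "spam"), {(0, 0): 0.7, (1, 0): 0.3,
                         (0, 1): 0.2, (1, 1): 0.8}),           # P(color | spam)
    (("noise", "spam"), {(0, 0): 0.8, (1, 0): 0.2,
                         (0, 1): 0.3, (1, 1): 0.7}),           # P(noise | spam)
    (("texture", "noise"), {(0, 0): 0.75, (1, 0): 0.25,
                            (0, 1): 0.1, (1, 1): 0.9}),        # P(texture | noise)
]

# Color and texture are observed; the unobserved noise node is summed out.
posterior = query(FACTORS, "spam", {"color": 1, "texture": 1}, ["noise"])
print(round(posterior[1], 3))  # 0.767
```

In this toy setting the noise node is unobserved, so it is multiplied into one intermediate factor and summed out before the result is normalized over the spam node. That summing-out is the step that distinguishes elimination from a direct table lookup.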
Keywords/Search Tags: Bayesian network, image-based spam, feature extraction, structure learning, reasoning, identification