
Image Classification And Retrieval Based On Semantic Distance And Features Combination

Posted on: 2015-01-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Wu
Full Text: PDF
GTID: 1228330467462698
Subject: Computer application technology
Abstract/Summary:
Image classification and retrieval have drawn considerable attention in the field of computer vision. With the development of information technology, the volume of image data is growing quickly, which makes it difficult for users to find the images they want. In addition, as new technologies emerge, user needs are constantly changing, which widens the gap between user demand and image semantics. In recent years, various techniques have been proposed to address semantic image understanding, but the problem has turned out to be extremely challenging because of large intra-class variations, inter-class similarities, and other difficulties, and none of these techniques solves it completely.

This thesis concerns several critical technologies for image classification and retrieval, including feature extraction, classification models, and multiple-feature fusion strategies. Our work and the proposed methods are as follows:

1. We propose a novel Gaussian Mixture Language Model that addresses the shortcomings of the traditional bag of visual words (BoVW) model. We first use image semantic information to learn a new distance metric that minimizes the loss of image information, and then train Gaussian Mixture Models (GMM) with this distance metric to generate a codebook. A test image is first converted into a visual document using this codebook; its category is then determined by finding the category whose language model gives the maximum probability. Experiments on the Caltech101 database show that the codebook generated by our method effectively represents image semantic information and is well suited to the language model. The results are competitive with traditional BoVW-based and state-of-the-art methods.

2. We propose a Nearest Neighbor (NN) model based on image semantic distance, which compensates for the loss of information and reduces the semantic gap caused by intra-class variations and inter-class similarities. We first learn a new distance metric from image semantic information, then use this metric to compute the distance between an image and the cluster centers within each image category, and construct an NN-based classifier. Experimental results on the ImageCLEF annotation dataset show that our model outperforms the traditional methods and is competitive with the state-of-the-art method.

3. We combine multiple visual features with our NN model and build multiple classifiers on top of it. Furthermore, we use the textual information that users have tagged on each image to form an entropy-weighted NN-based model, which further improves annotation performance. Experimental results on the ImageCLEF annotation dataset again confirm our methods.

4. We propose a novel multiple-classifier approach based on an improved SVM for the large-scale image annotation task. We first introduce the histogram intersection distance as the SVM kernel function. We then change the output of the original SVM into the distance to the hyperplane, design a set of logic decision rules for the SVM classifiers, and add a probability weight to each SVM classifier to improve performance. We also propose a feature selection method for our model, which does not require training images and directly compares the correlations among the different visual features. We then construct multiple classifiers based on the improved SVM using the selected visual features. Experiments on remote sensing images and the ImageCLEF dataset confirm the model's effectiveness.

5. We combine the visual features with textual features and construct multiple classifiers. Our experiments on ImageCLEF2012 show that user textual features play an important role in semantic image annotation and retrieval tasks. In particular, our result in the retrieval task ranks among the best, which confirms the importance of user textual information.
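
As an illustration of the SVM modification described in contribution 4, the following is a minimal sketch, not the thesis implementation. It assumes the standard histogram intersection kernel K(x, y) = sum_i min(x_i, y_i) on non-negative histograms and the usual geometric distance f(x) / ||w|| to the separating hyperplane. The support vectors, dual coefficients, and bias are hypothetical values chosen only for the example; the logic decision rules and probability weights are omitted because the abstract does not specify them.

```python
from math import sqrt


def histogram_intersection(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(x, y))


def decision_value(x, support_vectors, dual_coefs, bias):
    """Raw SVM output f(x) = sum_i c_i * K(s_i, x) + b,
    where c_i is the dual coefficient (alpha_i * y_i)."""
    return sum(
        c * histogram_intersection(s, x)
        for s, c in zip(support_vectors, dual_coefs)
    ) + bias


def distance_to_hyperplane(x, support_vectors, dual_coefs, bias):
    """Signed distance f(x) / ||w||, with
    ||w||^2 = sum_i sum_j c_i * c_j * K(s_i, s_j)."""
    norm_sq = sum(
        ci * cj * histogram_intersection(si, sj)
        for si, ci in zip(support_vectors, dual_coefs)
        for sj, cj in zip(support_vectors, dual_coefs)
    )
    return decision_value(x, support_vectors, dual_coefs, bias) / sqrt(norm_sq)


# Hypothetical trained model: one support vector per class, 3-bin histograms.
support_vectors = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
dual_coefs = [1.0, -1.0]  # assumed alpha_i * y_i values
bias = 0.0                # assumed bias term

query = [0.6, 0.3, 0.1]
dist = distance_to_hyperplane(query, support_vectors, dual_coefs, bias)
print(f"signed distance to hyperplane: {dist:.3f}")
```

A positive distance places the query on the side of the first (positive) class. Using a distance instead of a hard label gives each classifier a comparable score that later combination rules can weigh.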
Keywords/Search Tags: Image Classification and Retrieval, Distance Metric Learning, Features Combination, Pattern Classification, Bag of Visual Words