
Visual Feature Extraction And Semantic Annotation Towards Image Retrieval

Posted on: 2016-05-13    Degree: Master    Type: Thesis
Country: China    Candidate: C Wang    Full Text: PDF
GTID: 2308330464967976    Subject: Signal and Information Processing
Abstract/Summary:
At present there are two main types of image-oriented search engines. The first is text-based image retrieval, in which textual descriptions are written manually for each picture. This approach was proposed in the early days, when image collections were small; faced with today's vast numbers of pictures, manual labeling is too labor-intensive and highly subjective, since the results depend heavily on the annotator's perception and judgment. It has therefore gradually become unable to meet current requirements. The second is content-based image retrieval, which extracts stable visual features from each image to form descriptors, builds a distance-based similarity index over these low-level features, and returns images ranked by visual similarity. However, people habitually think of image retrieval in semantic terms: images with similar low-level visual features may not be semantically similar, while images with different low-level visual features may express the same semantic information. This is the well-known "semantic gap" of content-based image retrieval.

On the basis of machine learning, a semantic annotation model is established by forming a mapping between low-level visual features and high-level semantics, combining supervised and unsupervised learning. Dense SIFT is adopted as the local descriptor for feature extraction and description. Because the feature dimension is very high, the descriptors are first reduced in dimensionality; the image is then represented with Bag of Words (BoW), Vector of Locally Aggregated Descriptors (VLAD), and Fisher Vector (FV) encodings, respectively. FV has an advantage over BoW in vector coding because it uses a smaller visual dictionary to form a more detailed intermediate description. To compensate for the lack of spatial information in the feature descriptor, the image vector representation is extended with a spatial pyramid, and a support vector machine is finally applied to annotate image semantics.

Once the semantic annotation model is established, image retrieval can be carried out through semantic features. However, many similar images exist in the semantic space, and the results still sometimes fail to meet users' needs. On this basis, visual characteristics are therefore added to the retrieval system as an auxiliary step, so that the user can sort the results by visual similarity for more precise searching. The proposed method combines the advantages of both approaches by taking into account the semantic features obtained through supervised learning and the low-level visual features obtained through unsupervised learning. It not only yields a retrieval method more consistent with human thinking habits, but also markedly improves the effectiveness and accuracy of the image retrieval system by linking visual concepts with natural-language information.
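A minimal Python sketch of the annotation pipeline summarized above is given below. The grid step, PCA dimension, codebook size, and two-level pyramid are illustrative assumptions rather than the thesis's actual parameter choices, and VLAD stands in for the BoW and FV encodings that are also discussed.

```python
# Sketch (assumed parameters): dense SIFT -> PCA -> VLAD -> 2-level spatial pyramid -> linear SVM.
import cv2
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

STEP, SCALE = 8, 16        # dense sampling grid step and keypoint size (assumed)
PCA_DIM, K = 64, 16        # reduced descriptor dimension and codebook size (assumed)

sift = cv2.SIFT_create()

def dense_sift(gray):
    """SIFT descriptors computed on a regular grid (Dense SIFT)."""
    h, w = gray.shape
    kps = [cv2.KeyPoint(float(x), float(y), SCALE)
           for y in range(STEP, h - STEP, STEP)
           for x in range(STEP, w - STEP, STEP)]
    kps, desc = sift.compute(gray, kps)
    pts = np.array([kp.pt for kp in kps], np.float32)
    return pts, desc.astype(np.float32)

def vlad(desc, kmeans):
    """Aggregate residuals to the nearest codeword, then power- and L2-normalise."""
    k, d = kmeans.n_clusters, desc.shape[1]
    assign = kmeans.predict(desc)
    v = np.zeros((k, d), np.float32)
    for c in range(k):
        sel = desc[assign == c]
        if len(sel):
            v[c] = (sel - kmeans.cluster_centers_[c]).sum(axis=0)
    v = v.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))           # power normalisation
    return v / (np.linalg.norm(v) + 1e-12)

def spatial_pyramid_vlad(gray, pca, kmeans):
    """Concatenate VLAD of the whole image (1x1) with VLAD of 2x2 cells."""
    pts, desc = dense_sift(gray)
    desc = pca.transform(desc).astype(np.float32)
    h, w = gray.shape
    blocks = [vlad(desc, kmeans)]                  # level 0: whole image
    for gy in range(2):                            # level 1: 2x2 grid
        for gx in range(2):
            m = (pts[:, 0] // (w / 2) == gx) & (pts[:, 1] // (h / 2) == gy)
            blocks.append(vlad(desc[m], kmeans) if m.any()
                          else np.zeros(K * PCA_DIM, np.float32))
    return np.concatenate(blocks)

def train(images, labels):
    """Fit PCA and codebook on pooled descriptors, then train a linear SVM."""
    pool = np.vstack([dense_sift(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY))[1]
                      for im in images])
    pca = PCA(n_components=PCA_DIM).fit(pool)
    kmeans = KMeans(n_clusters=K, n_init=4).fit(pca.transform(pool))
    X = np.stack([spatial_pyramid_vlad(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY),
                                       pca, kmeans) for im in images])
    return pca, kmeans, LinearSVC().fit(X, labels)
```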
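The two-stage retrieval described in the last paragraph can likewise be sketched under assumed data structures: `db_vecs` (encoded low-level features of the database images) and `db_labels` (their predicted semantic concepts) are hypothetical names, and cosine similarity is one possible choice of visual re-ranking score.

```python
# Sketch: filter the database by predicted semantic label, then re-rank
# the candidates by low-level visual similarity to the query.
import numpy as np

def retrieve(query_vec, query_label, db_vecs, db_labels, top_k=10):
    """db_vecs: (N, D) encoded features; db_labels: (N,) predicted concepts."""
    candidates = np.where(db_labels == query_label)[0]   # semantic filter
    if candidates.size == 0:                             # fall back to the full set
        candidates = np.arange(len(db_labels))
    feats = db_vecs[candidates]
    sims = feats @ query_vec / (np.linalg.norm(feats, axis=1)
                                * np.linalg.norm(query_vec) + 1e-12)
    order = np.argsort(-sims)[:top_k]                    # visual re-ranking
    return candidates[order], sims[order]
```

In use, the query image would first be encoded with the same pipeline and annotated by the SVM; `retrieve` then returns the indices of the visually closest images that share its semantic label.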
Keywords/Search Tags: Image retrieval, Feature extraction, Spatial Pyramid, Image Semantics