
BoVW Model Based Research On Image Annotation

Posted on: 2013-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: G Zhao
Full Text: PDF
GTID: 2248330362971473
Subject: Computer applications
Abstract/Summary:
Computer and information processing equipment and technology continue to develop as informatization spreads through modern society. In this process, demand for digital images has grown explosively, which has made effective digital image retrieval a widely studied problem. In keeping with the habits of human cognition, labeling digital images according to their semantic content transforms image retrieval into mature text retrieval, and is a necessary means of achieving efficient retrieval of digital images. How to use a computer to automatically identify the semantic content of images and label them is a major difficulty that the computer vision and multimedia research communities have urgently sought to solve in recent years. Because digital images are unstructured data with semantic complexity, ambiguity, and other such characteristics, a "semantic gap" lies between low-level visual features and high-level semantics. How to use computer technology to bridge this semantic gap remains a great challenge.

As a common model in computer vision research with good applicability, simplicity, and efficiency, BoVW has been widely used in image annotation research and has shown outstanding performance. However, it is still far from practical application because of fundamental problems with BoVW itself. This paper aims to overcome BoVW's vector quantization loss and visual word ambiguity, and proposes corresponding improved formulations, offering a meaningful exploration of BoVW as a modeling method for image annotation.

The main work of this paper is as follows:

1. To overcome the BoVW model's vector quantization loss and visual word ambiguity, this paper proposes a new visual word weighting scheme, FWS (Fuzzy Weighting Scheme). Based on the pre-clustering results, FWS trains an OC-SVM for each cluster to acquire visual word mapping functions. Visual word mapping is determined according to the distances between sample features and the clustering hyperspheres. Visual words are weighted according to the spatial distribution of the clustering hyperspheres. FWS is designed to boost the expressiveness and discriminative power of visual words. In experiments, it outperforms TF and VWA in annotation precision, with improvements ranging from 16% to 34% and from 17% to 30%, respectively, with two different test sets as benchmarks.

2. This paper proposes a new image annotation formulation, Msso-BoVW (Multiple scale space optimization Bag-of-Visual-Word), based on an improved BoVW model that incorporates information from multiple scale spaces into the analysis of image semantic content, with the aim of addressing traditional BoVW's sensitivity to changes in image scale. Msso-BoVW transforms the original images into multiple scale spaces and constructs multiple scale vocabularies. Images are represented as a family of feature histograms at different scales. Multiple kernel learning is introduced to optimize the weights of the histograms at different scales in order to obtain discriminative classification power. In experiments, the proposed model outperforms BoVW in image annotation precision by 18.7% to 33.6% and by 19.1% to 29.6%, with the Caltech-256 image corpus and the Pascal VOC2009 image set as benchmarks.
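
To make the notions of vector quantization loss and visual word ambiguity concrete, the following is a minimal sketch in plain Python. It is not the FWS method from the thesis: the OC-SVM mapping functions and hypersphere-based weights are not specified in the abstract, so a generic Gaussian soft assignment is swapped in as a stand-in. The function names, the `sigma` parameter, and the toy data are all assumptions made for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_histogram(features, centers):
    """Classic BoVW: each descriptor votes only for its nearest visual word."""
    hist = [0.0] * len(centers)
    for f in features:
        nearest = min(range(len(centers)), key=lambda k: sq_dist(f, centers[k]))
        hist[nearest] += 1.0
    return [h / len(features) for h in hist]


def soft_histogram(features, centers, sigma=1.0):
    """Generic soft assignment (an assumption, not the thesis's FWS):
    each descriptor spreads one unit of vote over all visual words,
    weighted by a Gaussian of its distance to each cluster center."""
    hist = [0.0] * len(centers)
    for f in features:
        dists = [sq_dist(f, c) for c in centers]
        d_min = min(dists)  # shift by the minimum for numerical stability
        weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in dists]
        total = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / total
    return [h / len(features) for h in hist]


# Hypothetical toy vocabulary of two visual words and three descriptors.
# The middle descriptor is equidistant from both words (an ambiguous case).
centers = [(0.0, 0.0), (4.0, 0.0)]
features = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]

hard = hard_histogram(features, centers)
soft = soft_histogram(features, centers)
print("hard:", hard)
print("soft:", soft)
```

In this toy case the hard histogram gives the ambiguous middle descriptor entirely to the first word (the tie is broken by index order), while the soft histogram splits its vote evenly between the two words. That difference is the kind of information a weighting scheme for visual words tries to preserve.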
Keywords/Search Tags:Image annotation, Bag-of-Visual-Word, Visual Word mapping, Fuzzy weight, Multiple scale space, Multiple kernel learning