
LDA Model Combined With Spatial Information For Visual Object Recognition Research

Posted on: 2014-06-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2268330422450633
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer networks, huge image resources have become available, but a major challenge remains: finding an effective method to annotate images automatically, thereby reducing the labor cost and subjective bias of manual annotation and improving the accuracy of image retrieval. Semantic understanding of images is the key to this kind of problem. In recent years, many scholars have introduced the Latent Dirichlet Allocation (LDA) model, which is widely used in natural language processing, into image object recognition. This model makes image semantics easier to understand, but it has a drawback: it assumes that the latent topic assignments of visual words are conditionally independent. In images, spatial information plays an important role; that is, the generation of a visual word's latent topic is influenced by the latent topics of its adjacent visual words. This thesis therefore proposes an LDA model combined with spatial information for generating the topics of image visual words, and uses an SVM to classify the topic distribution vector of each image, thereby completing image object recognition.

First, image features are extracted. This thesis uses two types of image features: SIFT features and HOG features. The extracted SIFT and HOG features are then clustered with an online k-means algorithm. Finally, each image feature is represented by its index in the visual vocabulary obtained from clustering.

Second, the LDA model combined with spatial information is designed by incorporating a CRF into the LDA model. The CRF is introduced into the hidden layer, so that the generation of a visual word's latent topic depends on the latent topics of its adjacent visual words. The objective function to be optimized is derived from the model, and the EM algorithm together with variational inference is used to estimate the model's parameters. In addition, Gibbs sampling is used for parameter estimation of the original LDA model.

Finally, the trained model is used to perform inference on test images, yielding both the topic assignment of each visual word and the topic distribution vector of each image. An SVM is then used to classify the different types of images.

The training and test images of the VOC dataset are used for the experiments. The experiments show that the LDA model combined with spatial information makes effective use of spatial information and improves the object recognition rate compared with the original LDA model.
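
The quantization step and the baseline LDA topic inference described above can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: it uses plain LDA with collapsed Gibbs sampling and omits the CRF coupling between neighboring visual words. The function names (`nearest`, `online_kmeans`, `lda_gibbs`), the hyperparameters `alpha` and `beta`, the sequential k-means update rule, and the toy 2-D descriptors are all assumptions made for the example.

```python
import random

def nearest(centroids, x):
    """Index of the centroid closest to x (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((c - xi) ** 2 for c, xi in zip(centroids[j], x)))

def online_kmeans(descriptors, k, seed=0):
    """Sequential k-means: one pass over the data, updating one centroid per point."""
    rng = random.Random(seed)
    # Assumption: centroids are initialised from k randomly sampled descriptors.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    counts = [1] * k
    for x in descriptors:
        j = nearest(centroids, x)
        counts[j] += 1
        rate = 1.0 / counts[j]  # step size shrinks as a centroid absorbs more points
        centroids[j] = [c + rate * (xi - c) for c, xi in zip(centroids[j], x)]
    return centroids

def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA (no spatial term)."""
    rng = random.Random(seed)
    ndk = [[0] * n_topics for _ in docs]                # topic counts per image
    nkw = [[0] * vocab_size for _ in range(n_topics)]   # word counts per topic
    nk = [0] * n_topics                                 # total words per topic
    z = []
    for d, doc in enumerate(docs):                      # random initial assignment
        zd = []
        for w in doc:
            t = rng.randrange(n_topics)
            zd.append(t)
            ndk[d][t] += 1
            nkw[t][w] += 1
            nk[t] += 1
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                             # remove the current assignment
                ndk[d][t] -= 1
                nkw[t][w] -= 1
                nk[t] -= 1
                weights = [(ndk[d][k] + alpha) * (nkw[k][w] + beta)
                           / (nk[k] + vocab_size * beta) for k in range(n_topics)]
                t = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = t                             # record the resampled topic
                ndk[d][t] += 1
                nkw[t][w] += 1
                nk[t] += 1
    # Smoothed per-image topic distribution, usable as a classifier input.
    theta = [[(ndk[d][k] + alpha) / (len(doc) + n_topics * alpha)
              for k in range(n_topics)] for d, doc in enumerate(docs)]
    return theta, z

# Toy data: two "images", each a list of made-up 2-D descriptors.
images = [
    [(0.1, 0.2), (0.0, 0.1), (5.0, 5.1), (5.2, 4.9)],
    [(9.0, 0.1), (9.1, 0.0), (5.1, 5.0), (8.9, 0.2)],
]
pooled = [d for img in images for d in img]
centroids = online_kmeans(pooled, k=3)
docs = [[nearest(centroids, d) for d in img] for img in images]  # visual-word indices
theta, z = lda_gibbs(docs, n_topics=2, vocab_size=len(centroids))
```

In this sketch, each row of `theta` plays the role of the per-image topic distribution vector that would be passed to the SVM, and `z` holds the per-word topic assignments. A spatially aware variant would add a term to `weights` that depends on the topics of neighboring visual words, which is the part the thesis handles with a CRF.
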
Keywords/Search Tags: Image object recognition, LDA model combined with spatial information, EM algorithm, variational inference, SVM classifier