Image And Video Understanding And Retrieval Based On Analyzing Visual Information

Posted on: 2014-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: K Chen
Full Text: PDF
GTID: 2298330434966142
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, a variety of social media platforms have emerged, such as Facebook, Flickr, Twitter, and Instagram. With them, billions of images and videos are created, shared, and spread every day. How to understand and retrieve these images and videos efficiently has become a worthwhile and interesting research topic. This thesis addresses the understanding and retrieval of images and videos, and its results can be used to serve information retrieval engines.

In the image reranking part, we propose to rerank image retrieval results using a novel method that suits both object classes and scene classes. We first introduce two methods: the Exemplar model and the Saliency Map (SM). The Exemplar model is a top-down method based on the observation that the regions of interest (ROI) of images from the same class contain many similar discriminative local features. These features are trained into a model of the specific class, and the retrieved images are reranked by their similarity to the trained model of the query class. SM, on the other hand, is a bottom-up method that uses winner-take-all and inhibition-of-return mechanisms to visit locations in descending order of saliency, and the images are reranked by their saliency scores. In our experiments, the Exemplar model performs well on object classes and SM performs well on scene classes, because the two methods focus on different aspects when reranking images. We then propose a method named ExSM, which combines the advantages of the Exemplar model and SM. ExSM inherits the strength of the Exemplar model on object classes and of SM on scene classes, and outperforms both of them in general.

In the human action recognition part, we propose a method that combines a weight template with temporal templates to recognize human actions. First, considering that different parts play roles of different importance in human action recognition, we propose to weight the parts based on optical flow, separately for each action, and use the result as the local descriptor. We then evaluate the temporal templates Motion Energy Image (MEI) and Motion History Image (MHI) as the global descriptor and the temporal information descriptor, respectively. To take the local descriptor, the global descriptor, and the temporal information into account together, we fuse the local descriptor built on the weight template with the MEI and MHI for action recognition. Experimental results demonstrate that the fused descriptor recognizes human actions effectively and efficiently from different aspects.
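
As an illustration of the temporal templates named in the abstract, the following is a minimal sketch of how an MEI and an MHI can be computed from frame differences. It follows the commonly used definition in which a moving pixel is set to a duration tau and a still pixel decays by one per frame. The function name motion_templates, the parameters tau and diff_threshold, and the plain frame-differencing motion detector are assumptions made for this example. The abstract does not specify them, and this is not the author's implementation.

```python
def motion_templates(frames, tau=10, diff_threshold=25):
    """Compute (MEI, MHI) for a clip of grayscale frames.

    frames: sequence of 2D lists (rows of pixel intensities), all the same size.
    tau: hypothetical duration, in frames, for which motion stays visible.
    diff_threshold: hypothetical intensity change that counts as motion.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # MHI starts empty: no pixel has moved yet.
    mhi = [[0] * width for _ in range(height)]

    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > diff_threshold:
                    # Pixel moved in this step: stamp it with the full duration.
                    mhi[y][x] = tau
                else:
                    # Pixel is still: let older motion fade, never below zero.
                    mhi[y][x] = max(mhi[y][x] - 1, 0)

    # MEI marks where motion occurred within the last tau steps,
    # which is exactly where the MHI is still positive.
    mei = [[value > 0 for value in row] for row in mhi]
    return mei, mhi
```

In this sketch the MEI records only where motion happened, while the MHI also encodes how recently it happened: larger values mark more recent motion. For real video, the same update would normally be vectorized over array-based frames instead of looping over pixels.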
Keywords/Search Tags: Image Reranking, Image Retrieval, Exemplar model, Saliency Map, ExSM method, Facebook, Flickr, Video Understanding, Weight template, Temporal template, Human action recognition