
Semantic-Based Large Scale Video Retrieval System

Posted on: 2015-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: N Zhao
Full Text: PDF
GTID: 2298330467463931
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of computer and network technology, a tremendous amount of video information is produced and propagated quickly. Retrieving a target instance from a large-scale database has become one of the most attractive and challenging research topics in multimedia. This thesis addresses two problems: automatic instance search, and relevance feedback with re-ranking.

For automatic instance search, we first extract various low-level features from the video frames. The global features are the HSV color histogram and the LBP texture feature; the local features are SIFT descriptors computed with different detectors. We propose several fusion strategies that merge the results of the different low-level features and represent the visual information more effectively. Instead of using the traditional bag-of-words model to quantize the local features, we adopt a hierarchical vocabulary tree to train a high-dimensional codebook. Projecting the features onto the vocabulary tree yields a high-dimensional sparse vector that represents the visual information. In the retrieval stage, an inverted index is built, and a counting min-tree is used to merge the posting lists of the words in the inverted index. This approach proves fast and efficient, and it also saves storage space for a large-scale database. Based on these algorithms, we took part in the TRECVID competition and developed an automatic video retrieval system for instance search.

For relevance feedback and re-ranking, high-level semantic information is used to boost retrieval performance. Because of the semantic gap between low-level visual features and high-level human understanding, search performance degrades when retrieval relies on visual similarity alone. Users are therefore encouraged to give feedback on the initial ranking list, and the initial search results are reordered to offer a better user experience. Based on the Markov random walk algorithm, we propose building a semantic graph of an auxiliary dataset to detect confident samples, that is, samples similar to those labeled by users. The scores of the detected confident samples are then propagated to the remaining samples to re-rank the whole dataset.

Our methods are evaluated on the TRECVID dataset, the standard Paris dataset, and a France dataset introduced by us. The performance is shown to match or exceed the state of the art.
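The score propagation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a generic random walk with restart over a small hand-made similarity graph, and the names `random_walk_rerank`, `rank`, `alpha`, and `iterations`, as well as the toy similarity values, are hypothetical choices made for illustration.

```python
def random_walk_rerank(similarity, seed_scores, alpha=0.85, iterations=50):
    """Propagate seed scores over a similarity graph (random walk with restart).

    similarity  -- square list of lists, similarity[i][j] >= 0 (assumed input)
    seed_scores -- initial scores; non-zero for the confident samples
    alpha       -- probability of following an edge instead of restarting
    """
    n = len(similarity)

    # Row-normalise similarities into transition probabilities.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([w / total if total > 0 else 0.0 for w in row])

    # The restart distribution concentrates mass on the confident samples.
    seed_total = sum(seed_scores)
    restart = [s / seed_total for s in seed_scores]

    scores = restart[:]
    for _ in range(iterations):
        scores = [
            alpha * sum(scores[j] * transition[j][i] for j in range(n))
            + (1 - alpha) * restart[i]
            for i in range(n)
        ]
    return scores


def rank(scores):
    """Return sample indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy graph: sample 0 is the confident sample, sample 1 is very similar
# to it, sample 2 is weakly linked, and sample 3 is reachable only via 2.
similarity = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]
scores = random_walk_rerank(similarity, seed_scores=[1.0, 0.0, 0.0, 0.0])
print(rank(scores))
```

In this sketch, samples closer to the confident sample in the graph receive higher propagated scores, which is the intuition behind re-ranking by propagation; how the thesis builds its semantic graph and selects confident samples is not specified in the abstract.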
Keywords/Search Tags: content-based video retrieval, vocabulary tree, high-level semantic, relevance feedback, re-ranking, random walk