SimHash-based Massive Video Retrieval

Posted on: 2016-05-17  Degree: Master  Type: Thesis
Country: China  Candidate: X G Luo  Full Text: PDF
GTID: 2308330470960214  Subject: Computer Science and Technology
Abstract/Summary:
On the Internet, videos are often copied, edited, and uploaded again, which produces a large number of similar or duplicate videos. Content-based video similarity retrieval can effectively address this class of problem, and video content publishers and regulators also rely on similarity retrieval to monitor video content. As the volume of video data and the number of online video users continue to grow, how to retrieve large-scale video data efficiently and quickly is becoming a research hot spot.

Therefore, this thesis uses the SimHash algorithm to build features for video key frames, which turns the massive video retrieval problem into a Hamming distance retrieval problem. On this basis, it proposes a Hamming distance retrieval method based on the Bloom Filter algorithm. For every SimHash signature in the library, the method exhaustively enumerates all signatures within Hamming distance K and summarizes them in a Bloom Filter structure, which resembles a BitMap. A final Hamming distance query then only needs a set operation over the BitMaps, which improves query efficiency.
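The enumeration idea described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the token-based `simhash`, the `BloomFilter` class, the filter size, the number of hash functions, and the use of one shared bitmap are all assumptions made for the example, and string tokens stand in for real key-frame features.

```python
import hashlib
from itertools import combinations


def simhash(tokens, bits=64):
    """SimHash of string tokens (assumed stand-ins for key-frame features)."""
    weights = [0] * bits
    for tok in tokens:
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def neighbors_within(sig, k, bits=64):
    """Yield every signature whose Hamming distance to sig is at most k."""
    for r in range(k + 1):
        for positions in combinations(range(bits), r):
            flipped = sig
            for p in positions:
                flipped ^= 1 << p
            yield flipped


class BloomFilter:
    """Plain Bloom filter over 64-bit integers (sizes are illustrative)."""

    def __init__(self, size_bits=1 << 20, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bitmap = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        data = item.to_bytes(8, "big")
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=seed.to_bytes(2, "big")
            ).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bitmap[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bitmap[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def build_index(signatures, k, bits=64):
    """Insert every signature within distance k of each library signature."""
    index = BloomFilter()
    for sig in signatures:
        for neighbor in neighbors_within(sig, k, bits):
            index.add(neighbor)
    return index
```

With such an index, a query signature `q` is reported as a probable near-duplicate when `q in index` is true, so the query cost no longer depends on the library size. The trade-off is build time and memory: for 64-bit signatures and K = 2, each library signature contributes 1 + 64 + 2016 = 2081 insertions, and a Bloom filter can return false positives but never false negatives.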
To address the space complexity of feature extraction in massive video retrieval, the MapReduce framework is introduced, and a MapReduce algorithm is designed for distributed processing, which solves the space complexity problem.

The innovations of this work are as follows:

1) To address the high time complexity of existing SimHash signature matching, a Bloom Filter structure is introduced to improve SimHash signature lookup. Tests on the CC_WEB_VIDEO data set show that, with recall and precision maintained, and with 12,790 videos, the algorithm efficiency of the proposed method increased by 2 times compared with Zhang's LSH-based method.

2) To address the high time and space complexity of feature extraction and index building in massive video retrieval, the MapReduce framework is introduced and a parallel processing algorithm based on MapReduce is designed. Compared with single-machine feature extraction, the MapReduce-based method improves time and space efficiency and scales linearly.
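The distributed feature extraction mentioned above follows the usual map, shuffle, reduce pattern that the MapReduce keyword refers to. The sketch below simulates that pattern in a single process. It is only an illustration under stated assumptions: `frame_signature` is a hypothetical stand-in for the real key-frame SimHash, the record layout `(video_id, frame_bytes)` is invented for the example, and a real deployment would run the two phases on a cluster framework rather than through `run_local`.

```python
import hashlib
from collections import defaultdict


def frame_signature(frame_bytes):
    """Hypothetical stand-in for the SimHash signature of one key frame."""
    return int.from_bytes(hashlib.md5(frame_bytes).digest()[:8], "big")


def map_phase(record):
    """Map: one (video_id, frame_bytes) record -> one (video_id, signature) pair."""
    video_id, frame_bytes = record
    yield video_id, frame_signature(frame_bytes)


def reduce_phase(video_id, signatures):
    """Reduce: collect the distinct key-frame signatures of one video."""
    return video_id, sorted(set(signatures))


def run_local(records):
    """Single-process simulation of map -> shuffle (group by key) -> reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_phase(record):
            groups[key].append(value)
    return dict(reduce_phase(key, values) for key, values in groups.items())
```

Because each record is mapped independently and each video is reduced independently, both phases can be spread over more workers as the data grows, which is the property behind the linear scaling claimed in the abstract.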
Keywords/Search Tags: Massive video retrieval, SimHash, Bloom Filter, MapReduce