Key Technology of Large-Scale Cross-Media Data Retrieval

Posted on: 2016-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: J L Zhao
Full Text: PDF
GTID: 2298330467992093
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the development of computer technology and the growing popularity of the Internet, accessing and sharing multimedia information such as images and videos has become increasingly convenient. How to find the desired multimedia data quickly and accurately in vast collections has therefore become a hot research issue in the field of information retrieval.

In this paper, we focus on three aspects: instance-based object retrieval, Hadoop-based image retrieval, and web video clustering for hot events. The main work is as follows.

First, we design and implement a bag-of-words (BoW) based object retrieval system by improving the performance of several core parts: feature extraction, the similarity metric, a feature weighting scheme based on foreground and background information, the combination of multiple queries, and query expansion. The system is evaluated on the TRECVID INS 2013 and 2014 video datasets. Experimental results indicate that the proposed method achieves a higher mAP than traditional methods. Among the 22 participating teams in TRECVID INS 2014, our results rank sixth, which verifies the method's effectiveness.

Second, a Hadoop-based image retrieval system is developed to improve the speed of image retrieval. We use HDFS to store the image library and offline data files, and apply the MapReduce framework to extract features and measure similarity. Experimental results on the standard Corel image dataset show that our method effectively reduces the processing time of the CBIR system while keeping the same query precision as single-node retrieval.

Third, we propose a hot event video clustering method based on multi-modal fusion. Video title texts are first segmented into words, and Fisher vectors are used to represent video content. Then three similarity block matrices are constructed: text-modal, video-modal, and text-video-modal. Finally, a spectral clustering algorithm is applied to the combined multi-modal similarity matrix. The clustering result is evaluated on a self-built hot event dataset, and experimental results show the effectiveness of the proposed method.
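
The multi-modal clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the similarity matrices are fused or how many clusters are used, so everything below is an assumption made for illustration: the similarity values are invented, `fuse` uses a plain weighted average, and `spectral_bipartition` splits the videos into only two groups using the second eigenvector of the normalized affinity matrix (found by power iteration) instead of a full k-way spectral clustering. The function and variable names are hypothetical and do not come from the thesis.

```python
import math

# Hypothetical pairwise similarities for four videos (values are made up).
# Videos 0 and 1 cover one event; videos 2 and 3 cover another.
s_text = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
s_video = [
    [1.0, 0.7, 0.2, 0.1],
    [0.7, 1.0, 0.1, 0.2],
    [0.2, 0.1, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]


def fuse(s_a, s_b, alpha=0.5):
    """Weighted average of two similarity matrices (an assumed fusion rule)."""
    n = len(s_a)
    return [[alpha * s_a[i][j] + (1 - alpha) * s_b[i][j] for j in range(n)]
            for i in range(n)]


def spectral_bipartition(w, iters=200):
    """Split items into two clusters from a symmetric affinity matrix w."""
    n = len(w)
    deg = [sum(row) for row in w]
    # Shifted normalized affinity (I + D^-1/2 W D^-1/2) / 2.
    # The shift keeps all eigenvalues in [0, 1] so power iteration is stable.
    m = [[0.5 * ((1.0 if i == j else 0.0) + w[i][j] / math.sqrt(deg[i] * deg[j]))
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is known in closed form: proportional to sqrt(deg).
    total = math.sqrt(sum(deg))
    first = [math.sqrt(d) / total for d in deg]
    # Power iteration with deflation converges to the second eigenvector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, first))
        v = [a - dot * b for a, b in zip(v, first)]
        v = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # The sign of each component assigns the item to one of two clusters.
    labels = [0 if x >= 0 else 1 for x in v]
    # Canonicalize so that item 0 always receives label 0.
    if labels[0] == 1:
        labels = [1 - label for label in labels]
    return labels


labels = spectral_bipartition(fuse(s_text, s_video))
print(labels)
```

On this toy input the first two videos fall into one cluster and the last two into the other. A practical system would more likely use an eigen-solver from a numerical library and k-means on several eigenvectors to handle more than two events.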
Keywords/Search Tags: object retrieval, bag-of-words, Hadoop, hot event video, spectral clustering