
Research Of Large Scale Image Retrieval Based On Cloud Platform

Posted on: 2018-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhu
Full Text: PDF
GTID: 2348330512979393
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, multimedia technology, and computer vision, a large amount of multimedia data has been generated. On the one hand, these digital images provide people with a wealth of shared resources and a rich visual experience. On the other hand, how to organize and manage massive, complex image collections, and how to find the needed information quickly and accurately, has become a practical and urgent problem. Although image retrieval has developed from text-based to content-based methods, retrieving images by their content at big-data scale not only inherits the difficulties of traditional CBIR (Content-Based Image Retrieval), including the discriminative power of feature descriptions and the accuracy and complexity of feature matching, but also introduces new problems. Content-based retrieval methods that support parallel processing and timely response have therefore become a research hotspot, and the cloud platforms that emerged alongside big data offer a new direction for solving the problem. As an open-source platform, Hadoop has been adopted by researchers to solve a variety of problems because of its advantages in computing and storage.

In view of this, this thesis studies large-scale image retrieval on the Hadoop cloud platform and uses Hadoop to implement parallel retrieval over large-scale image collections. Retrieval is divided into two stages. The first is a coarse screening stage, which obtains a candidate image set based on binarized Fisher vectors. The second is a fine ranking stage, which sorts the candidate image set based on SIFT features to produce the final retrieval results.

The main work of this thesis is as follows. We propose merging the image feature description files to reduce Hadoop's processing cost for small files; binarizing the Fisher vector, a global feature descriptor, to accelerate feature matching in a big-data environment; and caching the features of the query image in the distributed environment to reduce I/O traffic. On this basis, we carry out parallel image retrieval experiments on the Hadoop cloud platform using the Holidays, Kentucky, and Flickr1M datasets, and we analyze the results in terms of file organization, retrieval efficiency, and retrieval accuracy. (1) As a baseline for comparison with retrieval on Hadoop, we test image retrieval based on an inverted index on a single machine, using the Holidays, Kentucky, and Flickr1M datasets. (2) We analyze and discuss single-machine retrieval and cloud-based retrieval in detail in terms of scalability and experimental performance. (3) The experiments show that large-scale image retrieval on the Hadoop cloud platform achieves higher retrieval accuracy and efficiency, scales well, and is applicable to general image retrieval. Image retrieval based on the Hadoop cloud platform therefore has broad application prospects.
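
The first-stage screening on binarized Fisher vectors can be illustrated with a minimal sketch. The abstract does not state how the vectors are binarized or compared, so the sign thresholding and Hamming-distance ranking below are assumptions, and the names `binarize`, `hamming`, `coarse_screen`, and the toy 4-dimensional vectors are hypothetical. The second stage (re-ranking the candidates with SIFT features) and the Hadoop parallelization are not shown.

```python
from typing import Dict, List, Sequence


def binarize(vector: Sequence[float]) -> int:
    """Pack the sign pattern of a real-valued vector into an integer.

    Assumption: one bit per dimension, 1 if the value is positive, else 0.
    """
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two packed codes."""
    return bin(a ^ b).count("1")


def coarse_screen(query_code: int, database: Dict[str, int], k: int) -> List[str]:
    """Return the k image ids closest to the query in Hamming distance.

    Ties are broken by image id so the ordering is deterministic.
    """
    ranked = sorted(
        database,
        key=lambda image_id: (hamming(query_code, database[image_id]), image_id),
    )
    return ranked[:k]


# Toy stand-ins for Fisher vectors (hypothetical values, 4 dimensions only).
toy_vectors = {
    "img_a": [0.9, -0.2, 0.4, -0.7],
    "img_b": [-0.5, 0.3, -0.1, 0.8],
    "img_c": [0.7, -0.1, -0.3, -0.6],
}
codes = {image_id: binarize(v) for image_id, v in toy_vectors.items()}
query_code = binarize([0.8, -0.3, 0.5, -0.9])

# Candidate set that would be passed on to a fine ranking stage.
candidates = coarse_screen(query_code, codes, k=2)
print(candidates)
```

Comparing packed bit codes needs only an XOR and a bit count per image, which is the kind of saving over floating-point distance computation that binarization is meant to provide.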
Keywords/Search Tags:Image retrieval, Cloud platform, Binarized Fisher vector, File merge, Parallel retrieval, Scalability