Image Retrieval Research Based On Spark And Deep Learning

Posted on: 2019-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Ma
Full Text: PDF
GTID: 2428330545453413
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the rapid spread of smart mobile devices has generated large volumes of data. Multimedia data, images in particular, has grown exponentially, which has driven the development of big-data image processing technology. Retrieving images quickly from massive collections is one of the active research topics in this field. In image retrieval, deep learning is becoming the dominant approach to image representation, but training a network with fewer parameters and better performance remains difficult. After local features are extracted, a retrieval model can be built with the VLAD (vector of locally aggregated descriptors) algorithm. However, the retrieval performance of VLAD depends on the number of visual words, and the algorithm does not consider the distances between a local feature and multiple visual centers. Both issues reduce retrieval accuracy. To address these problems, this thesis proposes a distributed image retrieval method with better performance, based on the Spark big data platform, the deep learning model AlexNet, and the VLAD algorithm. The main research work of this thesis includes:

1. The deep learning model AlexNet is analyzed and an improved method is proposed. Because a deep network can be used for image feature extraction, the basic structure and optimization methods of deep networks are analyzed first. Changes are then made to the network structure, to the parameter fine-tuning, and to the master-slave communication in distributed gradient descent. The result is a deep network with better classification performance.

2. Building on VLAD, a VLAD algorithm that combines soft and hard assignment is proposed. First, a two-layer visual dictionary is used to reduce the dependence of VLAD performance on the size of the visual dictionary. Then, because a local feature may be related to multiple visual words, the weight coefficients are assigned by combining soft assignment and hard assignment.

3. A parallel retrieval system is implemented. First, the improved AlexNet network is used to extract multiple sub-images from the same image with sliding windows. Then the extracted sub-images are processed by the improved VLAD algorithm, and a parallel implementation of the improved VLAD vector on the Spark platform is designed. Finally, each vector is decomposed into sub-vectors by product quantization, the data of each subspace is stored on a different slave node, and an index is built on each node so that retrieval runs in parallel and is faster.
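As a rough illustration of the second and third points, the following Python sketch shows one way a VLAD vector could blend hard and soft assignment, and how the result could be cut into sub-vectors ahead of product quantization. The abstract does not give the actual weighting formula, so the blending factor `alpha`, the kernel width `beta`, the neighbor count `k`, and all function names here are assumptions made for this example. The two-layer dictionary, the codebook training of product quantization, and the Spark parallelization are not shown.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assignment_weights(desc, centers, k=2, beta=1.0, alpha=0.5):
    """Weights of one local descriptor over its k nearest visual words.

    Hypothetical scheme: a Gaussian soft assignment over the k nearest
    centers (normalized to sum to 1) is scaled by (1 - alpha), and the
    nearest center receives an extra hard-assignment weight of alpha.
    """
    dists = [sq_dist(desc, c) for c in centers]
    nearest = sorted(range(len(centers)), key=lambda i: dists[i])[:k]
    d_min = dists[nearest[0]]
    # Subtracting d_min keeps exp() stable and cancels in the normalization.
    soft = {i: math.exp(-beta * (dists[i] - d_min)) for i in nearest}
    total = sum(soft.values())
    weights = {i: (1.0 - alpha) * s / total for i, s in soft.items()}
    weights[nearest[0]] += alpha
    return weights


def vlad(descriptors, centers, k=2, beta=1.0, alpha=0.5):
    """Aggregate weighted residuals per visual word, then L2-normalize."""
    dim = len(centers[0])
    acc = [[0.0] * dim for _ in centers]
    for d in descriptors:
        for i, w in assignment_weights(d, centers, k, beta, alpha).items():
            for j in range(dim):
                acc[i][j] += w * (d[j] - centers[i][j])
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def split_subvectors(vec, m):
    """Cut a vector into m equal-length sub-vectors (the first step of
    product quantization; each part could live on a different node)."""
    if len(vec) % m != 0:
        raise ValueError("vector length must be divisible by m")
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]


if __name__ == "__main__":
    centers = [[0.0, 0.0], [4.0, 0.0]]          # toy visual dictionary
    descriptors = [[1.0, 0.0], [3.0, 1.0]]      # toy local features
    v = vlad(descriptors, centers)
    print(split_subvectors(v, 2))
```

Setting `alpha=1.0` reduces the sketch to classic hard-assignment VLAD, and `alpha=0.0` gives a purely soft assignment, which makes the effect of the blend easy to compare on a small dataset.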
Keywords/Search Tags:Spark, deep learning, image retrieval, VLAD, product quantization, parallel retrieval