Design And Realization Of Face Image Retrieval System Based On Spark Framework

Posted on: 2018-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: A S Zhang
GTID: 2348330518485462
Subject: Computer technology
Abstract/Summary:
In the era of big data, data volumes are growing explosively, and the processing of massive data is becoming more and more important. A variety of big data processing frameworks have therefore been developed, such as Hadoop, Storm and Spark. In the field of image processing, face recognition technology has matured after decades of research and is gradually moving to the market. Image retrieval based on big data is also a new hot topic for universities, research institutions and companies. Massive image data retrieval faces two technical difficulties: first, how to use algorithms to reduce the overall amount of computation; second, how to use a distributed architecture to make rational use of hardware resources and improve computing efficiency.

For the first difficulty, this paper reduces the SIFT feature from 128 dimensions to 32 dimensions via the PCA algorithm, and then uses a combination of the Canopy and K-Means algorithms to cluster the 32-dimensional SIFT features. After word-frequency vectors are obtained by counting over the clustered feature matrix, the images are classified into K classes using the LDA model of Spark MLlib, fed with all the images. Finally, a query image submitted by the user only needs to be compared for similarity with the images in the same class. For the second difficulty, this paper employs distributed cluster computing to improve computational efficiency. The system adopts the distributed frameworks HBase and Spark, which support distributed parallel operation. The core algorithms of the system, Canopy, K-Means, LDA and Euclidean distance, are all implemented efficiently under the Spark framework.

Focusing on the problem of big data-based face image retrieval, this paper designs and implements a face image retrieval system based on the Spark framework. The main contributions are summarized below:
1. The Canopy and K-Means algorithms are improved and implemented under the Spark framework. A comparison with the corresponding algorithms from the Hadoop machine learning library (Mahout) and the Spark machine learning library (MLlib) indicates that the two improved algorithms achieve better performance on the Spark framework.
2. The core feature matching algorithm is run under the Spark and Hadoop machine learning libraries, respectively, and the experiments demonstrate that the Spark framework has better computational performance than the Hadoop framework for fast retrieval of face images.
3. The face image retrieval system based on the Spark framework has been designed and implemented. The system consists of three subsystems: a database subsystem, a user interaction subsystem and a feature matching subsystem.
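
The clustering and matching steps described in the abstract can be illustrated with a minimal single-machine sketch in plain Python. It is not the thesis's Spark implementation and does not reproduce its improved algorithms. It assumes a simplified Canopy pass with a single distance threshold (here called `t2`) to choose the initial K-Means centers, toy 2-dimensional descriptors in place of the 32-dimensional PCA-reduced SIFT features, and it omits the PCA and LDA stages entirely. All function names, image names and values are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))

def canopy_centers(points, t2):
    """Simplified Canopy: take a point as a center, drop all points within t2, repeat."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        remaining = [p for p in remaining if euclidean(p, center) > t2]
    return centers

def kmeans(points, centers, iterations=10):
    """Plain Lloyd iterations, seeded with the Canopy centers."""
    centers = [list(c) for c in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old center if no point was assigned to it
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def word_histogram(descriptors, centers):
    """Normalized word-frequency vector: share of descriptors per cluster."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(d, centers)] += 1
    total = sum(counts)
    return [c / total for c in counts]

# Toy data: each image is a list of 2-D descriptors (hypothetical values).
images = {
    "a": [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1)],
    "b": [(5.0, 5.0), (5.1, 4.9), (0.0, 0.0)],
    "c": [(5.0, 5.0), (4.9, 5.1), (5.1, 5.0)],
}
query = [(0.0, 0.0), (0.2, 0.1), (5.0, 4.9)]

all_descriptors = [d for descs in images.values() for d in descs]
centers = kmeans(all_descriptors, canopy_centers(all_descriptors, t2=1.0))

# Index every image by its word-frequency vector, then rank by distance to the query.
index = {name: word_histogram(descs, centers) for name, descs in images.items()}
query_hist = word_histogram(query, centers)
ranking = sorted(index, key=lambda name: euclidean(query_hist, index[name]))
print(ranking)
```

In this sketch the Canopy pass determines the number of clusters, so K does not have to be chosen in advance. In the system described above, the similarity comparison would additionally be restricted to the images in the same LDA class as the query.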
Keywords/Search Tags:Big data, Face image retrieval, Spark, HBase, Canopy-Kmeans