
Design And Implementation Of Scene Image Classification System Based On MapReduce

Posted on: 2019-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: H Shi
Full Text: PDF
GTID: 2428330566476297
Subject: Software engineering
Abstract/Summary:
Scene image classification is an important part of graphics and image processing, and the performance of the classification algorithm plays a key role in solving it. In the era of big data, scene image data sets keep growing and image feature dimensionality rises rapidly. Faced with massive image data, traditional classification algorithms suffer a sharp increase in computation and a steep drop in time performance, making them ill-suited to massive, high-dimensional scene image data.

In response, this thesis designs and implements a prototype scene image classification system based on the MapReduce parallel programming model. Principal component analysis (PCA) is first used to reduce the dimensionality of the SIFT features extracted from scene images, and random forests are then used to classify the reduced features; all algorithms are parallelized with the MapReduce programming model. The main results are as follows:

(1) A PCA-SIFT scene image feature extraction algorithm based on the MapReduce parallel programming model is proposed. The algorithm extracts SIFT features from scene images in parallel and uses PCA to reduce their dimensionality. Experiments on the SUN Database scene image database and laboratory coal-mine data show that the proposed algorithm detects SIFT feature points effectively and greatly improves running efficiency. When processing large-scale image data sets, the system speedup grows roughly linearly, demonstrating the algorithm's effectiveness on large-scale coal-mine data.

(2) A parallel scene image classification algorithm based on random forests is proposed on the Hadoop platform. The algorithm consists of two parts, learning and prediction: the learning phase builds a random forest by training multiple decision trees, and the prediction phase uses the constructed forest to classify an input scene image's feature matrix by majority vote. The algorithm is implemented on the MapReduce parallel programming model, and experiments show that it is scalable and performs well under the Hadoop platform.

Based on the above work, a prototype system for classifying massive scene image data was designed and developed, achieving efficient classification of large volumes of scene images.
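The thesis itself includes no code. As an illustration of the PCA step in result (1), the following is a minimal single-machine sketch of reducing 128-dimensional SIFT-style descriptors with PCA via an eigendecomposition of the covariance matrix; the random descriptors are stand-ins for real SIFT output, and in the thesis this computation is distributed across MapReduce tasks rather than run on one node.

```python
import numpy as np

def pca_reduce(descriptors, k):
    """Project descriptors onto their top-k principal components.

    In a MapReduce setting, mappers would emit per-image SIFT descriptors
    and partial covariance sums; here everything runs in one process.
    """
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Sample covariance of the 128-d descriptors.
    cov = centered.T @ centered / (len(descriptors) - 1)
    # eigh returns eigenvalues in ascending order; take the top-k.
    eigvals, eigvecs = np.linalg.eigh(cov)
    top_k = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, top_k]
    return centered @ components

rng = np.random.default_rng(0)
descs = rng.normal(size=(200, 128))   # stand-in for 200 SIFT descriptors
reduced = pca_reduce(descs, 32)
print(reduced.shape)                  # (200, 32)
```

The same projection would be applied to every image's descriptors before the bag-of-features (BOF) encoding mentioned in the keywords.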
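The prediction phase of result (2) can likewise be sketched as a map/reduce pair: each mapper applies one decision tree to the input feature vector and emits a (label, 1) vote, and the reducer tallies votes to pick the majority label. The threshold "trees" and class names below are toy stand-ins, not the decision trees the thesis trains on bootstrap samples.

```python
from collections import Counter

def make_threshold_tree(dim, thresh, lo, hi):
    """Toy one-split 'decision tree': compare one feature to a threshold."""
    return lambda x: lo if x[dim] <= thresh else hi

# Hypothetical forest of three single-split trees over two features.
forest = [
    make_threshold_tree(0, 0.5, "indoor", "outdoor"),
    make_threshold_tree(1, 0.3, "indoor", "outdoor"),
    make_threshold_tree(0, 0.7, "indoor", "outdoor"),
]

def map_phase(sample, forest):
    # Mapper: every tree votes independently -> emit (label, 1) pairs.
    return [(tree(sample), 1) for tree in forest]

def reduce_phase(pairs):
    # Reducer: sum the votes per label; the majority label wins.
    counts = Counter()
    for label, n in pairs:
        counts[label] += n
    return counts.most_common(1)[0][0]

x = [0.6, 0.2]
print(reduce_phase(map_phase(x, forest)))  # prints "indoor" (2 votes vs. 1)
```

Because each tree votes independently, the map phase parallelizes trivially across the forest, which is what makes random forests a natural fit for MapReduce.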
Keywords/Search Tags:Scene image classification, MapReduce parallel programming model, PCA-SIFT algorithm, BOF model, Random forest