
Research On Person Re-identification Based On Hadoop Platform

Posted on: 2019-03-24    Degree: Master    Type: Thesis
Country: China    Candidate: M Zhan    Full Text: PDF
GTID: 2428330563459585    Subject: Engineering
Abstract/Summary:
With the rapid development of networks and self-media, the era of big data has arrived. How to effectively analyze and process these data, and in particular how to extract valuable information from massive image data, has become a research hotspot in many fields. Pedestrian images account for a large proportion of large-scale image data and are an important part of it, with applications in surveillance and security, criminal investigation, searching for lost elderly people and children, and e-commerce management. Pedestrian re-identification has therefore become an important basis for identifying individuals in these areas. Faced with massive pedestrian image data, how to perform pedestrian re-identification efficiently and quickly has become a key issue in these fields. Research on improving the efficiency of re-identification over massive pedestrian images is thus a subject of both theoretical and practical value.

At present, cloud computing and distributed computing have become cutting-edge technologies for processing large amounts of data. Among them, the Hadoop distributed computing framework is the most popular. It has been widely used in academia and industry for massive file processing and has achieved good results, but its support for processing massive image files is not yet mature and needs further improvement. Based on the Hadoop platform, this thesis studies the re-identification of massive pedestrian images from the perspectives of parallelizing the re-identification algorithm and optimizing the computing model, and addresses the efficient computation and processing of massive, fragmented pedestrian image data. The main research work of the thesis is summarized as follows:

(1) Background colors are collected into a codebook (CB), and the samples at each pixel are counted to estimate the parameters of a Gaussian Mixture Model (GMM) distribution. The efficient CB method builds and updates the background model in real time, and the GMM parameters estimated from the CB clusters are used to detect the foreground region. CB-based GMM learning is partitioned and computed in parallel on a distributed multi-threaded architecture that scales across multiple nodes and lets threads within the same process share data, eliminating redundant in-process copies of the data (a minimal per-pixel sketch of this model is given after the list).

(2) The Hadoop Image Processing Interface (HIPI) is studied to optimize the storage of pedestrian images and the parallel processing of pedestrian image data. The user only needs to specify a HIPI Image Bundle (HIB) as input, and HIPI is responsible for delivering the decoded floating-point images to the Mappers and parallelizing the tasks. A HIB is a collection of images represented as a single file on HDFS. HIPI jobs run on the cluster as MapReduce parallel programs, enabling efficient, high-throughput image processing (see the Mapper sketch after the list).

(3) A spatially constrained quadratic similarity learning algorithm is proposed to handle potential variations in the spatial distribution of global appearance in pedestrian re-identification. The spatial metric learning algorithm is parallelized on the Hadoop platform: the parallel pipeline of feature extraction and feature matching for pedestrian re-identification is designed, and the MapReduce parallel computing model is used to efficiently store and compute the polynomial feature map of the quadratic similarity function, from which the local and global similarities of pedestrian images are further computed (the similarity formulation is sketched after the list).
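The background-modeling step in (1) combines a per-pixel codebook (CB) with a GMM whose parameters are estimated from the codebook clusters. The following is a minimal single-pixel sketch of that idea, not the thesis implementation: the class names, thresholds, and the way codewords seed the Gaussian components are assumptions made for illustration.

```java
// Minimal sketch: a per-pixel GMM whose components are seeded from codebook
// codewords (mean = codeword intensity, weight ~ codeword hit count).
// All names and thresholds here are illustrative assumptions, not thesis code.
import java.util.ArrayList;
import java.util.List;

public class CodebookGmmPixel {

  /** One Gaussian component per codebook codeword (grayscale for brevity). */
  static class Gaussian {
    double weight;   // mixing weight, estimated from codeword hit counts
    double mean;     // seeded with the codeword intensity
    double variance; // initialized to a fixed value, then updated online

    Gaussian(double weight, double mean, double variance) {
      this.weight = weight;
      this.mean = mean;
      this.variance = variance;
    }
  }

  private final List<Gaussian> mixture = new ArrayList<>();
  private static final double MATCH_SIGMAS = 2.5; // match threshold (assumed)
  private static final double ALPHA = 0.01;       // learning rate (assumed)

  /** Seed the mixture from codebook codewords: (intensity, hitCount) pairs. */
  CodebookGmmPixel(double[] codewordIntensities, int[] hitCounts) {
    int total = 0;
    for (int c : hitCounts) total += c;
    for (int k = 0; k < codewordIntensities.length; k++) {
      mixture.add(new Gaussian((double) hitCounts[k] / total,
                               codewordIntensities[k],
                               15.0 * 15.0));
    }
  }

  /** Returns true if the pixel is foreground; otherwise updates the model. */
  boolean isForeground(double intensity) {
    for (Gaussian g : mixture) {
      double diff = intensity - g.mean;
      if (diff * diff < MATCH_SIGMAS * MATCH_SIGMAS * g.variance) {
        // Matched a background component: online update of its parameters.
        g.mean += ALPHA * diff;
        g.variance += ALPHA * (diff * diff - g.variance);
        return false;
      }
    }
    return true; // no component explains the pixel -> foreground
  }
}
```

In the thesis, this per-pixel learning is partitioned across nodes and threads, with threads in the same process sharing the model rather than copying it.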
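For (2), HIPI hands each image stored in the HIB to a Mapper as a decoded floating-point image together with its header. The sketch below follows the structure of the public HIPI tutorial example; the package and class names (org.hipi.image.FloatImage, org.hipi.image.HipiImageHeader) depend on the HIPI version, and the "feature" computed here is only a placeholder, not the thesis's pedestrian-feature code.

```java
// Sketch of a HIPI MapReduce Mapper that receives decoded images from a HIB.
// Class/package names follow the HIPI 2.x tutorial and may differ by version;
// the computed statistic (mean intensity) is a placeholder feature.
import java.io.IOException;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Mapper;
import org.hipi.image.FloatImage;
import org.hipi.image.HipiImageHeader;

public class PedestrianFeatureMapper
    extends Mapper<HipiImageHeader, FloatImage, IntWritable, FloatWritable> {

  @Override
  public void map(HipiImageHeader header, FloatImage image, Context context)
      throws IOException, InterruptedException {
    if (image == null || image.getWidth() < 1 || image.getHeight() < 1) {
      return; // skip images HIPI could not decode
    }
    // Placeholder "feature": mean value over all pixels and bands.
    float[] data = image.getData();
    float sum = 0f;
    for (float v : data) {
      sum += v;
    }
    float mean = sum / data.length;
    // Emit one record per image; a real job would emit a feature vector.
    context.write(new IntWritable(1), new FloatWritable(mean));
  }
}
```

In the driver, the HIB file on HDFS is registered as the job input so that HIPI splits it across Mappers; the Reducer then aggregates the per-image features.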
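The "quadratic similarity" in (3) is commonly written as a Mahalanobis-type distance term plus a bilinear similarity term, which can be rewritten as an inner product between a weight vector and an explicit polynomial feature map of the image pair; this is what makes it natural to store and evaluate the feature map with MapReduce. The formulation below is a sketch of that standard form (as used in spatially constrained similarity learning), not necessarily the exact model of the thesis:

```latex
% Quadratic similarity between feature vectors x and y (assumed standard form):
% a Mahalanobis-type term plus a bilinear term, parameterized by A and B,
% equivalently an inner product with an explicit degree-2 feature map phi.
f(\mathbf{x},\mathbf{y})
  = (\mathbf{x}-\mathbf{y})^{\top}\mathbf{A}\,(\mathbf{x}-\mathbf{y})
  + \mathbf{x}^{\top}\mathbf{B}\,\mathbf{y}
  = \bigl\langle \mathbf{w},\, \phi(\mathbf{x},\mathbf{y}) \bigr\rangle

% Spatially constrained combination of a global term and K local (region) terms
% with non-negative mixing weights mu_k:
F(\mathbf{x},\mathbf{y})
  = \mu_{0}\, f_{0}\!\bigl(\mathbf{x}^{(0)},\mathbf{y}^{(0)}\bigr)
  + \sum_{k=1}^{K} \mu_{k}\, f_{k}\!\bigl(\mathbf{x}^{(k)},\mathbf{y}^{(k)}\bigr)
```

Here x^(0), y^(0) denote global appearance features and x^(k), y^(k) the features of the k-th spatially constrained local region, matching the abstract's combination of local and global similarities.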
Keywords/Search Tags: Hadoop, Pedestrian re-identification, HIPI, Spatial Constraints, Quadratic Similarity Learning