Research And Application Of Image Recognition Algorithm Based On Deep Learning

Posted on: 2018-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: W Xu
GTID: 2428330572455291
Subject: Software engineering
Abstract/Summary:
Image recognition, a key part of intelligent perception technology, is a classic problem that remains to be solved. The main issues are how to process massive amounts of image and video data effectively, how to detect and recognize objects in images accurately, and finally how to understand images and videos the way human intelligence does. Images are among the most common kinds of information in daily life, and compared with other kinds of data they are informative, complex and redundant, so digital image processing is usually harder than processing other data. The human visual system, however, shows a high ability for image recognition, and many researchers have therefore built artificial neural networks on bionics theory in order to simulate it. After the success of shallow artificial neural networks, research on deep neural networks ran into difficulties, e.g. high training cost and locally optimal solutions, when researchers tried to improve classification and recognition accuracy. Recently the wave of deep learning has injected new vitality into research on deep models, which are now considered to outperform humans on some visual challenges.

Building on this research, we model a convolutional neural network (CNN) to solve the image recognition problem. The main idea is to simulate the human visual system as closely as possible. Our model arranges neuron activations in 3D rather than in a 2D architecture and mainly uses three kinds of layers: the Convolutional Layer, in which a sliding window provides local sensing and weight sharing; the Activation Layer, for which we compare different activation functions to find the best one for our training model; and the Pooling Layer, in which max pooling or average pooling reduces the size of the feature maps. During training we visualize the feature maps and weights of each layer. We also apply transfer learning on small datasets to avoid underfitting and high training cost.

We use Caffe as the deep learning framework to build the CNN and test it on the CIFAR-10 dataset. The experiment shows that training on a GPU with an acceleration library is considerably faster than training on a CPU and gives a high image-processing capability. In a comparison test, we constructed a dataset from captured videos and compared complete training (weights initialized with random numbers) against transfer learning (weights initialized from a trained model), and we concluded that transfer learning usually achieves higher accuracy with less training time. For multi-target classification, we built DIVS (deep intelligent visual surveillance) and used it to detect and classify multiple objects. To improve the reliability and robustness of the model, we keep collecting training data with DIVS and then apply the model to real-world scenarios with high accuracy. Finally, to improve the ability to handle massive datasets, we migrate the training system from a single node to a multi-node cluster in a distributed parallel computing environment built on Hadoop and Spark.
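The three layer types named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the Caffe model from the thesis: the function names (`conv2d_valid`, `relu`, `pool2d`), the choice of ReLU as the activation, and the toy input values are all assumptions made for illustration. It shows one shared kernel sliding over a 3D input volume, an element-wise activation, and max or average pooling that shrinks the feature map.

```python
from typing import List

Matrix = List[List[float]]   # indexed [row][col]
Volume = List[Matrix]        # indexed [channel][row][col]


def conv2d_valid(volume: Volume, kernel: Volume,
                 bias: float = 0.0, stride: int = 1) -> Matrix:
    """Slide one shared square kernel over a 3D volume (no padding).

    The same kernel weights are reused at every position (weight sharing),
    and each output value depends only on a small window (local sensing).
    """
    channels = len(volume)
    height, width = len(volume[0]), len(volume[0][0])
    k = len(kernel[0])
    out_h = (height - k) // stride + 1
    out_w = (width - k) // stride + 1
    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = bias
            for c in range(channels):
                for di in range(k):
                    for dj in range(k):
                        pixel = volume[c][i * stride + di][j * stride + dj]
                        total += pixel * kernel[c][di][dj]
            row.append(total)
        out.append(row)
    return out


def relu(fmap: Matrix) -> Matrix:
    """Element-wise ReLU, one of several activations that could be compared."""
    return [[max(0.0, v) for v in row] for row in fmap]


def pool2d(fmap: Matrix, size: int = 2, mode: str = "max") -> Matrix:
    """Non-overlapping pooling; mode is "max" or "average"."""
    out: Matrix = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[i + di][j + dj]
                      for di in range(size) for dj in range(size)]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


if __name__ == "__main__":
    # Toy 2-channel 5x5 input and a 2x2 kernel (illustrative values only).
    volume = [
        [[float(r * 5 + c) for c in range(5)] for r in range(5)],
        [[1.0] * 5 for _ in range(5)],
    ]
    kernel = [
        [[-1.0, 0.0], [0.0, 1.0]],
        [[0.5, 0.5], [0.5, 0.5]],
    ]
    feature_map = relu(conv2d_valid(volume, kernel))
    print(pool2d(feature_map, size=2, mode="max"))
```

With a 5x5 input and a 2x2 kernel at stride 1, the convolution yields a 4x4 feature map, and 2x2 pooling reduces it to 2x2. Swapping `mode="max"` for `mode="average"` switches between the two pooling variants the abstract mentions.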
Keywords/Search Tags: Artificial Neural Network, Deep Learning, Convolutional Neural Network, Distributed Parallel Computing, Spark