The Research Of Image Recognition Based On Convolutional Neural Network

Posted on: 2018-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: D Yu
Full Text: PDF
GTID: 2348330518986516
Subject: Computer Science and Technology
Abstract/Summary:
Image recognition is a technology that uses computers to process and analyze images in order to identify various types of targets. It is one of the most important research topics in pattern recognition and machine learning. Feature extraction is a key step in image recognition. Traditional feature extraction methods such as the scale-invariant feature transform (SIFT) and the histogram of oriented gradients (HOG) are difficult and time-consuming to use: they must be designed carefully by hand, and they may still not extract the best features for recognition. The emergence of deep learning makes it possible to learn features automatically from training data. Deep learning can improve the accuracy of classification and prediction by constructing a neural network model with many hidden layers and using large amounts of training data to learn more useful information. The convolutional neural network (CNN) is one kind of deep learning model. It has become an active research topic in voice analysis and image recognition, and it has achieved especially good results in image recognition. However, the theory of CNNs is not yet complete, and CNNs need further research and improvement. This dissertation takes the CNN as its main research topic, studies the main ideas and shortcomings of the algorithm, and proposes improved methods that address these shortcomings and further improve the efficiency and robustness of the algorithm. The specific research work is summarized as follows:

(1) A CNN is good at learning features, but it is not always optimal for classification. The extreme learning machine (ELM), whose parameters are computed directly by the least-squares method, in theory attains the lowest training error and trains very quickly. It is therefore good at producing decision surfaces from well-behaved feature vectors, but it cannot learn complicated invariances. Based on the respective strengths and weaknesses of CNN and ELM, this dissertation proposes a hybrid system in which a CNN is trained to extract features and an ELM is trained on the features learned by the CNN to recognize faces. We also propose fixing part of the filters in the convolutional layers in advance, which reduces the number of CNN parameters, with the aim of improving recognition accuracy. The experimental results show that the proposed method effectively improves face recognition performance, and that pre-fixing part of the filters outperforms stochastic filters when the training set is small.

(2) Training a CNN requires a large number of labeled samples, and a CNN does not perform well when the amount of training data is small. PCANet is a simple deep learning network that uses the eigenvectors obtained from principal component analysis (PCA) as convolutional kernels, which avoids the training process. Two-dimensional principal component analysis (2DPCA) is an improvement on PCA. It avoids a drawback of PCA, which must transform 2D matrices into 1D vectors, a step that increases the amount of computation and destroys structural information about the image. This dissertation proposes an improved version of PCANet, called 2DPCANet, which replaces PCA with 2DPCA in PCANet. The experimental results verify the effectiveness of the proposed method.

(3) Global CNN activations lack geometric invariance. To address this problem, Gong et al. proposed the multi-scale orderless pooling CNN (MOP-CNN), which extracts features from local patches via a CNN at multiple scales and then uses Vectors of Locally Aggregated Descriptors (VLAD) to encode the local features at each level separately. However, we find that this method improves performance mainly because it extracts global and local representations simultaneously, and that VLAD pooling is not necessary, because the representations extracted by the CNN are already good enough for classification. In this dissertation, we propose a new method for extracting multi-scale CNN features, leading to a new deep learning structure. The method extracts features from local patches via a CNN at multiple scales, concatenates all the representations within each level and reduces their dimension via PCA, and finally concatenates the results of all levels to form the final features for classification. The experimental results show that the proposed method outperforms MOP-CNN in both accuracy and efficiency.
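
As an illustration of the classifier stage in item (1), the following is a minimal sketch of an ELM trained on fixed feature vectors. It is not the dissertation's implementation. The function names elm_fit and elm_predict, the hidden-layer size n_hidden, the tanh activation, and the toy two-cluster data standing in for CNN features are all assumptions made for this example. The sketch only shows the general ELM idea stated above: the hidden-layer weights are random and untrained, and the output weights are obtained in one step by least squares.

```python
import numpy as np


def elm_fit(features, labels, n_hidden=64, seed=0):
    """Fit an ELM classifier on fixed feature vectors (e.g. CNN outputs).

    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_classes = int(labels.max()) + 1
    # Hidden-layer weights and biases are drawn at random and never trained.
    w = rng.standard_normal((features.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    hidden = np.tanh(features @ w + b)
    # One-hot encode the integer class labels.
    targets = np.eye(n_classes)[labels]
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(hidden) @ targets
    return w, b, beta


def elm_predict(model, features):
    """Return the predicted class index for each feature vector."""
    w, b, beta = model
    return np.argmax(np.tanh(features @ w + b) @ beta, axis=1)


# Toy stand-in for CNN features: two well-separated Gaussian clusters.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-2.0, 0.5, (50, 8)), rng.normal(2.0, 0.5, (50, 8))])
y = np.array([0] * 50 + [1] * 50)

model = elm_fit(x, y)
train_acc = float(np.mean(elm_predict(model, x) == y))
print("training accuracy:", train_acc)
```

In a hybrid system of the kind described, the array x would instead hold the activations of a trained CNN for each face image. Because the output weights come from a single pseudoinverse computation, no iterative training of the classifier is needed.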
Keywords/Search Tags: Image recognition, Convolutional neural network, Extreme learning machine, 2DPCA, Multi-scale features