
Research On Computer Vision Feature Representation And Learning For Image Classification And Recognition

Posted on: 2015-01-18    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Z Yang    Full Text: PDF
GTID: 1268330422981630    Subject: Information and Communication Engineering
Abstract/Summary
Vision feature extraction is critical for image classification and recognition. Features with good performance can reduce the dependence on complex machine learning algorithms for obtaining satisfactory results, and they directly influence the performance of the whole vision system. Feature extraction is therefore an important research direction in the field of computer vision. Over the course of this research, several kinds of feature extraction methods, such as color features, texture features, local features, and global features, have been proposed to solve specific problems, often with good results. However, traditional feature extraction methods suffer from two problems.

First, as vision tasks grow more complex, the basic features no longer give satisfactory results when used directly for classification. Feature representation methods have therefore been proposed to improve their performance. Feature representation applies vector quantization, sparse coding, or similar methods to obtain a final feature representation of an input image. The most typical method is the "Bag of Words" (BoW) model, which performs statistical analysis on the basic features according to a dictionary. Methods based on BoW have been extensively studied and used in recent years (since 2006) and have achieved good results in image classification and recognition.

Second, for a given vision task, obtaining a satisfactory result generally requires considerable prior knowledge or complex parameter selection, which increases the difficulty of the classification problem. To address this, "feature learning" methods have been proposed in recent years (since 2007); they learn features automatically from the raw pixels through a neural network structure.
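The BoW encoding described above can be sketched as follows: each local descriptor is assigned to its nearest visual word in a dictionary (typically learned by k-means), and the image is represented by a normalized histogram of word counts. This is a minimal illustrative version, not the dissertation's implementation; the function names and toy dimensions are assumptions.

```python
import numpy as np

def bow_encode(descriptors, dictionary):
    """Encode a set of local descriptors as a Bag-of-Words histogram.

    descriptors: (n, d) array of local features (e.g. dense SIFT)
    dictionary:  (k, d) array of visual words (e.g. k-means centers)
    Returns an L1-normalized histogram of length k.
    """
    # squared Euclidean distance from every descriptor to every visual word
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    assignments = d2.argmin(axis=1)  # hard vector quantization
    hist = np.bincount(assignments, minlength=len(dictionary)).astype(float)
    return hist / hist.sum()

# toy example: 100 random 8-D descriptors, a 16-word dictionary
rng = np.random.default_rng(0)
h = bow_encode(rng.normal(size=(100, 8)), rng.normal(size=(16, 8)))
print(h.shape, round(h.sum(), 6))  # (16,) 1.0
```

In practice the histogram is often combined with spatial pyramid pooling before classification, but the quantize-and-count step above is the core of the model.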
In general, two types of networks can be used for this purpose, single-layer neural networks and deep neural networks, and both have been applied successfully to image classification and recognition.

Considering the above, this dissertation is dedicated to research on feature representation and learning for image classification and recognition, with the aim of obtaining effective vision features. By analyzing current feature representation and learning methods, we propose new approaches and apply them to specific vision tasks. The main work and innovations of this dissertation are as follows:

1. We propose a feature extraction method for Kinect images based on locality-constrained linear coding (LLC). Specifically, we extract dense SIFT features from the RGB image and the depth image of a Kinect image pair and perform feature coding on each. The features are used for Kinect scene classification and object classification, and experiments on the NYU Depth and B3DO datasets demonstrate their performance.

2. We carry out a comparative study of several feature extraction methods for person re-identification and propose a new feature extraction method that integrates LLC with HSV and Lab color histograms. Additionally, because the feature extraction methods in the recent literature are complex, we propose a new appearance model called Object-Centric Coding (OCC) for person re-identification. Under the OCC framework, the silhouette of a pedestrian is first extracted via Stel Component Analysis (SCA); dense SIFT features are then extracted and encoded with LLC. In this way the coding descriptor focuses on the genuine body, eliminating the influence of the background. Comparative experiments against existing approaches show that the OCC model significantly improves person re-identification rates, and several metric learning methods are used to evaluate its effectiveness.
3. We analyze several feature learning methods based on single-layer networks and propose L2-regularized sparse filtering for feature learning. This method guarantees a sparse distribution of the learned features while gaining better generalization ability. Classification experiments on four different datasets (STL-10, CIFAR-10, Small NORB, and subsets of the CASIA-HWDB1.0 handwritten characters) show that our method outperforms standard sparse filtering.

4. We also investigate feature learning methods based on deep learning. Because the recognition rates for Similar Handwritten Chinese Character Recognition (SHCCR) in traditional two-level classification systems are limited by their feature extraction methods, we propose a new method based on Convolutional Neural Networks (CNN) that learns effective features automatically and performs recognition. In addition, we use big data from a handwriting cloud platform to train the network and further improve accuracy. The final experimental results show that our method outperforms a Support Vector Machine (SVM) and a Nearest Neighbor Classifier (1-NN) based on gradient features.

In conclusion, this work shows that effective feature representation methods can greatly improve the performance of image classification and recognition, and that feature learning methods based on single-layer networks and deep architectures can learn features from raw image data, avoiding the complexity of designing hand-crafted features. Feature learning remains a frontier research direction with wide application prospects.
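The LLC coding step that recurs in contributions 1 and 2 can be sketched using the common approximated-LLC formulation: each descriptor is reconstructed from its k nearest dictionary atoms under a sum-to-one constraint. This is a single-descriptor sketch for illustration; `knn` and `beta` are assumed defaults, not the dissertation's settings.

```python
import numpy as np

def llc_encode(x, dictionary, knn=5, beta=1e-4):
    """Approximated locality-constrained linear coding for one descriptor.

    x: (d,) descriptor; dictionary: (k, d) codebook.
    Solves min ||x - B_nn^T w||^2 s.t. sum(w) = 1 over the knn nearest atoms.
    """
    d2 = ((dictionary - x) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:knn]        # local neighborhood of the descriptor
    z = dictionary[idx] - x           # shift atoms to the descriptor's origin
    C = z @ z.T + beta * np.eye(knn)  # regularized local covariance
    w = np.linalg.solve(C, np.ones(knn))
    w /= w.sum()                      # enforce the sum-to-one constraint
    code = np.zeros(len(dictionary))
    code[idx] = w                     # sparse code: nonzero only on neighbors
    return code

rng = np.random.default_rng(1)
c = llc_encode(rng.normal(size=8), rng.normal(size=(32, 8)))
print(np.count_nonzero(c), round(c.sum(), 6))  # at most 5 atoms active, weights sum to 1
```

Because only the k nearest atoms receive nonzero weights, the resulting codes are sparse and local, which is what makes simple max pooling over them effective for classification.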
Keywords: Computer vision features, Feature representation, Feature learning, Deep learning