
Research And Application Of Image Feature Learning And Classification Methods Based On Deep Learning

Posted on: 2017-03-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Feng
Full Text: PDF
GTID: 1108330503485223
Subject: Information and Communication Engineering
Abstract/Summary:
Image classification is one of the most active research fields in computer vision and is the foundation of other image application tasks. An image classification system usually consists of three important parts: low-level feature extraction, image representation, and classification. Features play the key role in the whole system, and good features extract the information that benefits classification. Designing an effective feature often requires domain knowledge of the corresponding research field, so researchers have proposed many different features for their respective fields. However, applying these low-level features directly to large-scale image classification often fails to achieve good performance. In addition, designing and tuning low-level features takes considerable time, which slows feature development and makes it the bottleneck of the image classification task.

Researchers therefore expect that effective features can be learned from images automatically. They found that a deep convolutional neural network (DCNN) learns both low-level and high-level features from a large number of images, and that its image classification performance is close to the human level. Feature learning has therefore become an important research area in image classification and will be widely applied.

For the feature learning of image classification, we study single-layer feature learning methods and extend them to multi-layer feature learning methods. We also apply a deep neural network to a practical application. The main work of this dissertation is as follows:

1. We study single-layer and multi-layer feature learning methods. Restricted Boltzmann Machine (RBM), Autoencoder, Sparse Coding (SC), and subspace learning are categorized as single-layer feature learning methods. After studying multi-layer feature learning methods, we conclude that supervised single-layer feature learning can be employed in a convolutional neural network.

2. We propose a novel manifold-learning-based discriminative learnable feature, the Discriminative Locality Alignment Network (DLANet). Based on a convolutional structure, DLANet learns the filters of multiple layers by applying DLA. In particular, we construct a two-layer DLANet. It is followed by a popular scene classification framework that combines Locality-constrained Linear Coding with Spatial Pyramid Matching (LLC-SPM) and a linear Support Vector Machine (SVM). We evaluate DLANet on NYU Depth V1, Scene-15, and MIT Indoor-67. Experiments show that DLANet outperforms hand-crafted features and PCANet/LDANet. The proposed classification system is also competitive with other methods.

3. We propose a new max-margin minimum classification error (M3CE) training method for deep neural networks (DNNs). In contrast to softmax regression with cross-entropy, Minimum Classification Error (MCE) aims not only to increase the posterior of the true class but also to decrease the output of the most confusable class. The proposed M3CE is better suited to training DNNs because it helps prevent vanishing gradients. We evaluate M3CE on two popular datasets, MNIST and CIFAR-10. Experimental results show that M3CE complements cross-entropy effectively and achieves better performance.

4. We design a DCNN for script and nature identification. In the training stage, we propose a text-line input technique for the CNN, which effectively captures more discriminative content for both script and nature identification. Because of the lack of data, we propose the Self-Reappeared Padding Scheme (SRPS) to generate more text-line images. Additionally, we propose a two-stage multi-task learning framework to learn a robust shared feature for script and nature identification, so that both identification results can be obtained from a single CNN. Finally, we evaluate three CNN architectures (small, middle, large) to determine the best architecture for script and nature identification. Experimental results show that the text-line input technique significantly improves performance. The accuracies of the two-stage multi-task CNN on nature and script identification reach 99% and 95%, respectively.
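The contrast drawn in item 3 between cross-entropy and an MCE-style criterion can be illustrated with a small sketch. The abstract does not give the M3CE formulation, so the code below is not the dissertation's method: it shows a generic MCE-style loss in which the misclassification measure is the posterior of the strongest rival class minus the posterior of the true class, smoothed by a sigmoid. The function names and the `slope` parameter are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_idx):
    """Cross-entropy: depends only on the posterior of the true class."""
    return -math.log(softmax(logits)[true_idx])


def mce_style_loss(logits, true_idx, slope=1.0):
    """Generic MCE-style loss (illustrative, not the thesis's M3CE).

    d = (posterior of the most confusable wrong class) - (posterior of the true class)
    d < 0 means the sample is classified correctly; the sigmoid turns d
    into a smooth approximation of the 0/1 classification error.
    `slope` is a hypothetical smoothing parameter.
    """
    probs = softmax(logits)
    rival = max(p for j, p in enumerate(probs) if j != true_idx)
    d = rival - probs[true_idx]
    return 1.0 / (1.0 + math.exp(-slope * d))


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1]  # made-up class scores for one sample
    for label in range(len(scores)):
        print(label, cross_entropy(scores, label), mce_style_loss(scores, label))
```

Lowering this loss requires both raising the true-class posterior and lowering that of the strongest rival, which is the behaviour the abstract attributes to MCE. Cross-entropy, by comparison, is computed from the true-class posterior alone.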
Keywords/Search Tags: Deep learning, Feature learning, Image classification, Deep convolutional neural network