
Research on Large-Scale Image Classification Based on Deep Hierarchy Representation Learning

Posted on: 2016-08-25  Degree: Master  Type: Thesis
Country: China  Candidate: N Lv  Full Text: PDF
GTID: 2308330473955128  Subject: Computer software and theory
Abstract/Summary:
In recent years, image classification has become an important research direction in image processing and computer vision, and it is also a fundamental problem in machine learning and pattern recognition. With the continuous development of the Internet and information technology, a large number of images reach people through various channels every day. In-depth study of large-scale image classification therefore has both theoretical significance and practical value. Deep learning has grown rapidly within machine learning, and its research results and applications continue to multiply. Using deep learning on a big data processing platform to solve image classification has become a research hot spot in recent years.

This thesis builds on the Hadoop big data processing platform and applies deep learning theory to large-scale image classification. The main contributions are as follows:

1. An image feature extraction algorithm based on distributed K-means is proposed. This work describes the problem solved by the single-machine K-means algorithm and its implementation, and then elaborates the basic idea and implementation of the distributed K-means algorithm. K-means is widely used to build the visual vocabulary of the bag-of-words model. Because K-means is convenient and efficient to implement, the thesis adopts distributed K-means on Hadoop to extract image features and, ultimately, to solve large-scale image classification. First, the cluster centers, which form the dictionary, are computed by the distributed K-means algorithm. Second, given the dictionary, a feature extraction function is constructed to extract image features. Finally, the learned features are fed into a classifier to classify the images.
On the ImageNet and CIFAR-100 datasets, this work studies the effect of pre-processing (whitening) on the dictionary and on classification accuracy. On the STL-10 dataset, the proposed algorithm is shown to achieve good classification accuracy.

2. A large-scale image classification algorithm based on deep hierarchy representation learning is presented. The algorithm follows the idea of deep learning: it starts from the raw pixels, extracts low-level features, and builds progressively higher-level abstractions layer by layer until it obtains more abstract image features. Combined with distributed processing, the algorithm can be run in parallel. In this thesis, the algorithm consists of five layers with a similar structure, each comprising pre-processing, image feature extraction, and feature selection. Each layer extracts features from the output of the previous layer and feeds the extracted features to the next layer. After all layers have been applied, the resulting features, which represent the images accurately, are input to a classifier for image classification. On the ImageNet and CIFAR-100 datasets, experiments are conducted to evaluate the effect of pre-processing (whitening), receptive field size, and stride on classification accuracy. Finally, the thesis compares the proposed algorithm with previous state-of-the-art methods; the comparison shows that the proposed algorithm can both cope with the demands on computing and storage resources and achieve good classification accuracy.
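The dictionary-learning and feature-extraction steps in item 1 can be illustrated with a minimal single-process sketch. It is not the thesis's Hadoop implementation: the partitions are plain Python lists standing in for data splits, the function names (`map_partition`, `reduce_partials`, `distributed_kmeans`, `encode`) are hypothetical, and the hard-assignment histogram used for encoding is an assumption, since the abstract does not specify the feature extraction function.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point (first one wins on ties)."""
    return min(range(len(centers)), key=lambda k: sq_dist(point, centers[k]))


def map_partition(points, centers):
    """Map step: per-center partial sums and counts for one data partition."""
    dim = len(centers[0])
    partial = {}
    for p in points:
        k = nearest(p, centers)
        sums, count = partial.get(k, ([0.0] * dim, 0))
        partial[k] = ([s + x for s, x in zip(sums, p)], count + 1)
    return partial


def reduce_partials(partials, centers):
    """Reduce step: merge the partial sums of all partitions into new centers."""
    dim = len(centers[0])
    new_centers = []
    for k, old in enumerate(centers):
        total, count = [0.0] * dim, 0
        for partial in partials:
            if k in partial:
                sums, n = partial[k]
                total = [t + s for t, s in zip(total, sums)]
                count += n
        # Keep the old center when no point was assigned to it.
        new_centers.append([t / count for t in total] if count else list(old))
    return new_centers


def distributed_kmeans(partitions, centers, iterations=10):
    """Alternate map and reduce steps for a fixed number of iterations."""
    for _ in range(iterations):
        partials = [map_partition(part, centers) for part in partitions]
        centers = reduce_partials(partials, centers)
    return centers


def encode(descriptors, centers):
    """Bag-of-visual-words feature: normalised histogram of nearest centers."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Only the per-center sums and counts leave each partition, so the data exchanged per iteration depends on the dictionary size rather than on the number of descriptors. That is the property that makes K-means a natural fit for a map/reduce platform. The vector returned by `encode` is what would be passed to the classifier.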
Keywords/Search Tags: image classification, deep learning, feature extraction, big data processing, distributed resources