Research On New Algorithms For Learning Image Feature Representations

Posted on: 2017-04-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Y Xie
Full Text: PDF
GTID: 1108330485960337
Subject: Computer Science and Technology
Abstract/Summary:
In many visual recognition tasks, one of the fundamental difficulties is finding discriminative image representations (also called high-performance image features). Designing good image features is challenging: on the one hand, image representations should be robust to intra-class variations; on the other hand, they should be powerful enough to discriminate inter-class differences among categories. Image features include patch-level features and image-level features (also called local features and global features). As the names suggest, patch-level features describe an image patch, while image-level features describe the whole image. In this dissertation, we focus on how to represent image features and propose effective algorithms for generating patch-level and image-level features for scene/object recognition. Our main contributions are:

(1) First, this dissertation presents a new image-level representation for image classification. The traditional Bag-of-Words (BOW) model loses discriminative power because it completely discards the spatial information of image features. We therefore present a spatial correlogram approach, which captures spatial co-occurrences of pairwise codewords. This representation augments the traditional Bag-of-Words model with spatial information and compresses the information contained in a correlogram without loss of discriminative power. To further increase the discriminative ability of the image features, we combine the correlogram with a spatial pyramid. In several scene/object recognition experiments, the proposed method achieves higher classification accuracy than the traditional Bag-of-Words model.

(2) Second, this dissertation presents a new patch-level representation, called the efficient kernel descriptor (EKD). Designing patch-level features is essential for achieving good performance in computer vision tasks such as image classification and object recognition, but the differences between hand-crafted patch-level features do not adequately reflect the similarities of images. The kernel descriptor (KD) method offers a new way to generate features from a match kernel defined over image patch pairs using kernel principal component analysis (KPCA), and it yields impressive results. However, all joint basis vectors are involved in the kernel descriptor computation, which is both expensive and unnecessary. To address this problem, we present a new algorithm for generating the EKD feature, built upon incomplete Cholesky decomposition. The efficient kernel descriptor automatically selects a small number of pivot features to achieve better computational efficiency. Surprisingly, and perhaps owing to this parsimony, the efficient kernel descriptor approach achieves better image/scene categorization performance than the original kernel descriptor approach despite its lower cost.

(3) Third, we present a new image-level representation based on the method for constructing the efficient kernel descriptor (EKD), called the efficient hierarchical kernel descriptor (EHKD). The original kernel descriptor (KD) was only used for image patches, so Bo et al. proposed the hierarchical kernel descriptor (HKD), which is essentially a recursive application of the kernel descriptor, to describe whole images. However, HKD inherits the intrinsic computational cost of the original kernel descriptor construction. We therefore present a new algorithm for generating the EHKD feature, which is also built upon incomplete Cholesky decomposition. The efficient hierarchical kernel descriptor recursively applies the efficient kernel descriptor to form image-level features layer by layer. Experimental results on several public datasets show that the efficient hierarchical kernel descriptor approach achieves competitive results and improved efficiency compared with the hierarchical kernel descriptor method.

(4) Finally, this dissertation presents a supervised patch-level representation, called supervised efficient kernel descriptors (SEKD). Recently, unsupervised learning approaches have been employed to design patch-level features based on the similarities of image patches. These approaches, such as the kernel descriptor and the efficient kernel descriptor, have shown better performance than pre-defined image features in object recognition. They give a kernel generalization of orientation histograms and suggest a promising way to 'grow up' features from available information. A major limitation of these approaches is that patch similarities are not directly linked to object categories. A supervised approach to learning patch-level features that takes image class labels into account is therefore needed. We achieve this goal by proposing SEKD, in which incomplete Cholesky decomposition is performed jointly with image class labels during feature learning. Experimental results on several well-known image classification benchmarks suggest that supervised efficient kernel descriptors are more compact and have greater discriminative power than previous unsupervised feature descriptors.
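The spatial correlogram in contribution (1) counts how often pairs of codewords co-occur at given spatial distances. The abstract does not give the exact definition or the compression step, so the following is only a minimal sketch under assumptions: the function name `codeword_correlogram`, the `(x, y, codeword)` input layout, and the distance thresholds in `radii` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def codeword_correlogram(points, radii):
    """Count co-occurrences of codeword pairs per distance bin.

    points: list of (x, y, codeword) tuples, one per local feature
            (hypothetical layout, not taken from the dissertation).
    radii:  increasing distance thresholds; a pair falls into the first
            bin whose threshold is >= the pair's Euclidean distance.
    Returns a Counter keyed by (word_a, word_b, bin_index), word_a <= word_b.
    """
    counts = Counter()
    for i in range(len(points)):
        xi, yi, wi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, wj = points[j]
            dist = math.hypot(xi - xj, yi - yj)
            for bin_index, radius in enumerate(radii):
                if dist <= radius:
                    word_a, word_b = sorted((wi, wj))
                    counts[(word_a, word_b, bin_index)] += 1
                    break  # each pair is counted in exactly one bin
    return counts


# Toy example: four local features quantized to codewords 0 and 1.
features = [(0, 0, 0), (1, 0, 1), (0, 1, 0), (5, 5, 1)]
hist = codeword_correlogram(features, radii=[1.5, 10.0])
```

Unlike a plain Bag-of-Words histogram, which would only record two occurrences of each codeword here, the counts in `hist` distinguish codeword pairs that lie close together from pairs that lie far apart.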
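Contributions (2) to (4) all rest on incomplete Cholesky decomposition, which approximates a kernel matrix from a small set of automatically selected pivots. The sketch below shows the generic pivoted algorithm only; it is not the dissertation's EKD procedure, and the Gaussian kernel, the toy points, and the tolerance `tol` are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; an assumed stand-in for a match kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def incomplete_cholesky(K, tol=1e-9, max_rank=None):
    """Pivoted incomplete Cholesky of a PSD matrix K (list of lists).

    Returns (G, pivots), where G is a list of columns such that
    K[i][j] is approximated by sum(col[i] * col[j] for col in G),
    and pivots lists the selected row indices in selection order.
    """
    n = len(K)
    max_rank = n if max_rank is None else min(max_rank, n)
    residual = [K[i][i] for i in range(n)]  # diagonal of K - G G^T
    G, pivots = [], []
    while len(pivots) < max_rank:
        # Greedily pick the sample that is currently worst approximated.
        p = max(range(n), key=lambda i: residual[i])
        if residual[p] <= tol:
            break  # remaining samples are already well approximated
        root = math.sqrt(residual[p])
        col = [(K[i][p] - sum(g[i] * g[p] for g in G)) / root
               for i in range(n)]
        G.append(col)
        pivots.append(p)
        for i in range(n):
            residual[i] = max(residual[i] - col[i] ** 2, 0.0)
    return G, pivots


# Toy data with duplicated points, so the kernel matrix has rank 3.
points = [(0, 0), (1, 0), (0, 0), (1, 0), (0, 1)]
K = [[gaussian_kernel(p, q) for q in points] for p in points]
G, pivots = incomplete_cholesky(K)
max_error = max(
    abs(K[i][j] - sum(col[i] * col[j] for col in G))
    for i in range(len(points)) for j in range(len(points))
)
```

On this toy input the loop stops after three pivots, because the duplicated points add no new information. This is the sense in which a small pivot set can replace the full set of basis vectors.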
Keywords/Search Tags: Patch-level feature, Image-level feature, Local feature, Global feature, Feature learning, Scene classification, Object recognition, Bag-of-Words, Spatial correlogram, Co-occurrences codewords, Spatial pyramid, Kernel descriptor