
Visual Object Classification: Multiple Kernel Multiple Instance Learning

Posted on: 2012-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Wang
Full Text: PDF
GTID: 2178330338491951
Subject: Signal and Information Processing
Abstract/Summary:
Visual object classification aims to automatically classify visual objects, or determine the category to which an image belongs, and to locate and extract the region of interest in the image. It is an active and difficult problem in computer vision and pattern recognition, and it is of great significance for the analysis and understanding of image content. In real-world scenes, visual objects vary in viewpoint, brightness, and scale; in addition, the number of images grows day by day, which makes traditional manual object extraction increasingly difficult. Machine learning methods are therefore introduced to learn and classify semantic concepts from the low-level visual features of images and to build complex visual object classification models. At present, low-level visual features such as color, texture, shape, and spatial relationships are usually used to represent image content. However, a large semantic gap exists between the low-level features represented by computers and the high-level semantics understood by humans.

This thesis studies visual object classification. It mainly addresses two issues: the reliance of traditional learning methods on manual extraction of visual objects, and the limited discriminative ability of the bag-of-words model. To this end, it improves existing multiple instance learning (MIL) methods. The main research contents are as follows.

1. Multiple instance learning combined with segmentation. Based on the MILES algorithm, we propose a novel multiple instance learning approach that combines segmentation for object detection and extraction. The approach uses the "Bag of Words" model: the whole image is regarded as a multiple-instance bag, and the visual words that represent the image are regarded as the instances in the bag. Each bag is mapped into a feature space defined by the visual vocabulary via the histogram over visual words. Next, a 1-norm SVM is applied to select important features and classify images simultaneously. The instances from bags classified as positive are then classified, and the positive instances are taken as object "seed" points. Finally, segmentation is combined with these seed points to extract the object. Experiments on the Caltech 101 dataset show that this approach achieves high efficiency.

2. Multiple instance learning based on visual phrases. The bag of visual words has limited descriptive and discriminative ability, and traditional learning methods may suffer from background clutter and large appearance variations. We therefore propose a MVPL (Multiple Visual Phrase Learning) method for image classification. In MVPL, visual phrases are first generated from over-segmented image regions of homogeneous appearance and the visual words within each region, which may provide enhanced descriptive ability by introducing spatial coherency. Then a devised MIL algorithm is applied to learn efficiently from the weakly labeled image data. Experimental results on the benchmark datasets Caltech 101 and Scene 15 show that the proposed method significantly outperforms the state-of-the-art algorithms, by about 9% and 7% respectively.

3. Multiple kernel multiple instance learning. A visual object is often associated with multiple visual measurements. If the object is represented by only one feature, the final classification result can be wrong when information is insufficient. MIL is a natural tool for processing weakly labeled datasets and achieves high classification accuracy. However, in MIL only one feature vector is used to represent each instance in the bag. We therefore propose a novel framework, Multiple Instance Multiple Kernel Learning (MIMKL), which solves the problem of combining various features in MIL. This framework, which is based on MIL, uses multiple features to describe each instance and computes the combined kernel weights during training. It combines the advantages of multiple features and achieves high classification accuracy. Experimental results on the benchmark dataset Caltech 101 show the efficiency of the proposed method.
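As a rough illustration of two ideas mentioned in the abstract, the sketch below maps a bag of visual-word indices to a normalized histogram and then combines one base kernel per feature type into a single weighted kernel. It is not the thesis's MILES-based method or its MIMKL algorithm. The function names, the choice of a linear and an RBF base kernel, and the toy word indices are all hypothetical. The kernel weights are fixed by hand here, whereas the abstract states that the weights are computed during training.

```python
from collections import Counter
from math import exp


def bag_to_histogram(word_ids, vocab_size):
    """Map a bag (list of visual-word indices) to a normalized histogram."""
    counts = Counter(word_ids)
    total = len(word_ids)
    if total == 0:
        return [0.0] * vocab_size
    return [counts.get(w, 0) / total for w in range(vocab_size)]


def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; gamma is an illustrative default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq_dist)


def combined_kernel(feats_x, feats_y, kernels, weights):
    """Convex combination of base kernels, one kernel per feature type.

    feats_x, feats_y: lists of feature vectors, one per feature type.
    weights: non-negative and summing to 1. Assumption: fixed by hand
    here; a multiple kernel learning method would learn them.
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(
        w * k(fx, fy)
        for w, k, fx, fy in zip(weights, kernels, feats_x, feats_y)
    )


# Two toy images, each described by two hypothetical feature types
# (a 3-word vocabulary and a 2-word vocabulary).
image_a = [bag_to_histogram([0, 0, 1, 2], 3), bag_to_histogram([1, 1], 2)]
image_b = [bag_to_histogram([0, 1, 2, 2], 3), bag_to_histogram([0, 1], 2)]

similarity = combined_kernel(
    image_a, image_b, kernels=[linear_kernel, rbf_kernel], weights=[0.5, 0.5]
)
print(similarity)
```

A convex combination of valid kernels is itself a valid kernel, so the combined value could in principle be passed to any kernel-based classifier; how the thesis actually couples the weights with MIL training is not specified in this abstract.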
Keywords/Search Tags:Visual Object Classification, Image Classification, Visual Phrase, Multiple Instance Learning, Multiple Kernel Learning, Multiple Kernel Multiple Instance Learning