
Research Of Image Recognition Techniques Based On The Semi-supervised Clustering And Generalized Distance Function Learning

Posted on: 2012-06-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Gu
Full Text: PDF
GTID: 1228330371955696
Subject: Control theory and control engineering
Abstract/Summary:
Effective and precise image recognition (IR) techniques are crucial in video search, image search, and robot applications. IR has several research branches, such as object detection, image classification, content-based image retrieval, and semantic image annotation. Recent studies have shown that the key problem in IR is building machine learning models that generalize well. This thesis focuses on two machine learning problems: semi-supervised clustering and distance function learning. We explore the performance of our methods in depth for real applications in image classification and retrieval.

The set of labelled sample images is small relative to the whole data space, so machine learning for image recognition is typically a small-sample-size problem. After feature extraction, the image representation is high-dimensional and complex, and it is hard to obtain satisfactory results with unsupervised clustering alone. Multi-sphere support vector clustering (MSVC) is an unsupervised method that solves the clustering problem in the high-dimensional feature space, and it is well suited to nonlinearly separable datasets. It is therefore beneficial to build a semi-supervised clustering algorithm on MSVC for complex image clustering problems.

From another perspective, distance metric learning is effective in improving the performance of image classification and retrieval systems. However, current distance learning methods for feature bags lose statistical information and lack an exemplar selection mechanism. We therefore propose generalized image distance functions (GIDF) and corresponding learning methods. The semi-supervised clustering method mentioned earlier is then used as the exemplar selection algorithm.
Besides its application to multi-class classification problems, online learning of the GIDF can solve the relevance feedback problem for multi-feature image retrieval systems.

Specifically, the main work of this thesis is as follows:

(1) We apply relative comparisons to support vector data description (SVDD) and present a novel semi-supervised support vector clustering method named RCS-MSVC. All data points are mapped into the feature space by a kernel function, and the clusters are learned with a k-means-like iterative algorithm. Both cluster compactness and the relative constraints are considered in the feature space. The method is well suited to complex, nonlinearly separable datasets.

(2) We propose an image indexing method for image retrieval based on RCS-MSVC preprocessing (RM-INDEX). The method provides two image-cluster similarity functions to solve the ranking problem in hierarchical RCS-MSVC clustering. We explore the system performance in detail under various parameters and distance ranking functions. Experimental results show that our method consistently improves system performance across the distance ranking functions.

(3) Current image distances between two feature bags lack statistical properties. We therefore define three types of generalized image distance functions (GIDF) to overcome this problem: an image distance function under full constraints and two image distance functions under multiple-instance constraints. A learning algorithm is provided for each distance function. In image classification, the method learns a local distance function for each exemplar image and uses AdaBoost to train the final strong classifier.

(4) A GIDF learning framework based on RCS-MSVC preprocessing (RM-PREC-GIDF) is proposed for multi-class image classification. The method uses the support vectors in RCS-MSVC as exemplar images and divides the global classifier into a series of local classifiers.
The classification of a new sample then depends only on the distance functions in neighboring clusters, which greatly improves learning efficiency.

(5) Focusing on high-quality license plate detection, we design a classifier based on RM-PREC-GIDF and implement a prototype system. The system extracts features from MSER-like regions, including color histograms, the vertical projection histogram, and the horizontal projection histogram. The regions are classified by the multi-class classifier learned by RM-PREC-GIDF. The method overcomes the limitation on target license plate region sizes and is highly applicable to plate recognition applications.

(6) We study the relevance feedback model in image retrieval systems for a practical e-commerce database. The method learns the weighting online based on the GIDF. To support the proposed feedback learning model and distributed computing, we design VU-Server, a lightweight, concurrent image retrieval framework based on multiple features. Experimental results show that it can fully support retrieval tasks on databases at the scale of ten million images.
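
The k-means-like iteration in kernel feature space mentioned in item (1) can be illustrated with plain kernel k-means. This is a minimal sketch under stated assumptions, not the RCS-MSVC method itself: it omits the SVDD spheres and the relative-comparison constraints, and the function names `rbf_kernel` and `kernel_kmeans`, the Gaussian kernel, and the `gamma` parameter are hypothetical choices made only for illustration. The point it shows is that distances to a cluster center in feature space can be computed from kernel values alone.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_kmeans(points, init_labels, n_clusters, gamma=1.0, max_iter=50):
    """Plain kernel k-means (illustrative only, not RCS-MSVC).

    The squared feature-space distance from point i to the mean of
    cluster c is K[i][i] - 2 * mean_j K[i][j] + mean_{j,l} K[j][l],
    where j and l range over the current members of c.
    """
    n = len(points)
    # Precompute the full kernel (Gram) matrix.
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    labels = list(init_labels)

    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Third term of the distance: depends only on the cluster.
        within = [
            sum(K[j][l] for j in m for l in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]

        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:  # skip empty clusters
                    continue
                cross = sum(K[i][j] for j in m) / len(m)
                d = K[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)

        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels

    return labels


if __name__ == "__main__":
    # Two well-separated groups; two points start in the wrong cluster.
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(kernel_kmeans(pts, [0, 0, 1, 1, 1, 0], n_clusters=2))
```

A semi-supervised variant in the spirit of item (1) would add a penalty for violated relative constraints to the assignment step; how RCS-MSVC does this is not specified in the abstract, so it is left out here.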
Keywords/Search Tags: image classification, image retrieval, image clustering, support vector data description, one-class SVM, semi-supervised clustering, generalized image distance function, metric learning, AdaBoost, license plate localization, relevance feedback