Machine Learning Based Object Recognition

Posted on: 2014-12-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G C Liu
GTID: 1268330422954169
Subject: Machine learning and computer vision
Abstract/Summary:
Computer vision is one of the core problems of artificial intelligence. Its ultimate goal is to give computers the visual ability of humans, i.e., to see and interpret visual scenes the way humans do. Computer vision has wide applications in medicine, industry, the military, aerospace and other fields. However, since human vision is known to occupy at least 60 percent of the human brain, it is generally accepted that computer vision may be an “AI-complete” problem, or is at least an “AI-difficult” problem.

Among the various vision problems, classifying the objects pictured in images into classes, known as object recognition, is one of the most fundamental. It is a very challenging problem and a crucial bottleneck that blocks the advance of many important applications such as image search. Although this problem has been explored for many years by the world’s most competitive academic institutions, such as MIT, Stanford, Yale, Cambridge and Princeton, it is still not well solved. From the viewpoint of machine learning, however, object recognition is feasible, at least to some extent. Namely, it is possible to implement a practical object recognition system that meets the requirements of real applications, provided that one can appropriately extract features from images, appropriately represent the objects, appropriately represent the object classes, and establish an appropriate mechanism to classify the objects.

In this thesis, we first introduce a prototype of a machine learning based object recognition system, which consists of an object segmentation sub-system, an object representation sub-system and a classifier. We devise novel algorithms to build these sub-systems, including an HGM-based object segmentation method, an object representation approach named RRFD and a classifier named NCC. To improve the performance of the object recognition system, we propose the models of LRR, LLT and Feedback Embedding for image clustering, multi-label classification and fast similarity search, respectively. To be precise, the innovations of this thesis include:

We propose HGM (hybrid graph model) for semi-supervised data clustering. To the best of our knowledge, we are the first to introduce the hybrid graph into machine learning. Based on HGM, we devise an efficient and effective system for automatically segmenting objects without annotated training images. This automatic object segmentation approach makes our object recognition system more appealing.

We propose a new feature descriptor based on the Radon transform, called RRFD. Given images in which the objects have been separated from the background, RRFD converts the objects into a feature vector that encodes their shape, texture and color. Moreover, RRFD can also be used as a general feature descriptor to generate feature vectors for an arbitrary image region.

To recognize object categories, we need to classify the feature vectors into their respective classes. Based on a neural coding hypothesis, we devise a new classification algorithm, called NCC. In comparison with the widely used SVM method, NCC performs much better on data whose training and testing distributions differ. When the testing data are sampled from the same distribution as the training data, NCC also slightly outperforms SVM.

When a single image can contain objects of multiple classes, the classification problem becomes an MLC (Multi-Label Classification) problem. We propose a novel mechanism, called LLT, for defining the loss functions in regression frameworks. Based on the well-established SVR framework, we implement an effective MOR (Multi-Output Regression) algorithm, called LLT-SVR. LLT-SVR also provides an effective way to perform multi-label classification, so it can extend our system from a single object class to multiple ones.

To improve the practicality of the object recognition system, we need a mechanism to group images into their respective topics. We establish the criterion of low rankness and propose a new method named LRR (Low-Rank Representation). To the best of our knowledge, we are the first to introduce the low-rank criterion into machine learning. Based on LRR, we establish an effective algorithm for image clustering.

To achieve fast recognition in large-scale databases, we devise a new semantic hashing indexing structure. The core of this structure is a new dimensionality reduction algorithm, called FE (Feedback Embedding). Compared with previous methods such as LLE (Locally Linear Embedding), FE provides a more convincing mechanism for dimensionality reduction.

Besides object recognition and the corresponding machine learning problems, in this thesis we also explore some essential scientific issues. For example, we try to answer the question of how the human brain processes visual signals. Namely, we propose a new neural coding hypothesis that reveals the reconstruction mechanism in the human brain.
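
As a rough illustration of the low-rank representation idea used for image clustering above, the sketch below treats only the idealized noise-free case, in which one seeks the matrix Z of minimal nuclear norm satisfying X = XZ. In that special case the minimizer has a known closed form, Z = V Vᵀ, where V holds the right singular vectors of X. This is not the algorithm developed in the thesis, which the abstract does not spell out; the function name `lrr_noiseless`, the tolerance and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def lrr_noiseless(X, tol=1e-10):
    """Hypothetical helper: closed-form minimizer of ||Z||_* s.t. X = X Z.

    Columns of X are data samples. Only the noise-free case is handled.
    """
    # Skinny SVD: keep right singular vectors with non-negligible singular values.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))
    V = Vt[:rank].T
    # V V^T is the orthogonal projector onto the row space of X.
    return V @ V.T


# Synthetic data (an assumption for the demo): two independent
# 2-dimensional subspaces of R^10, with 5 samples drawn from each.
rng = np.random.default_rng(0)
B1 = rng.standard_normal((10, 2))
B2 = rng.standard_normal((10, 2))
X = np.hstack([B1 @ rng.standard_normal((2, 5)),
               B2 @ rng.standard_normal((2, 5))])

Z = lrr_noiseless(X)

# Samples from different subspaces get (numerically) zero coefficients,
# so Z is block-diagonal and the blocks identify the clusters.
off_block = np.abs(Z[:5, 5:]).max()
print("largest cross-cluster coefficient:", off_block)
```

With independent subspaces and no noise, the cross-cluster entries of Z vanish up to rounding error, so an affinity built from |Z| separates the two groups. Real images are noisy, and handling that requires an explicit error term and an iterative solver, which this sketch does not attempt.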
Keywords/Search Tags: object recognition, machine learning, hybrid graph model, Radon representation, locally linear transformation, neural coding classifier, low-rank representation, feedback embedding