Font Size: a A A

Active Learning Based Visual Scene Understanding

Posted on:2012-11-24Degree:DoctorType:Dissertation
Country:ChinaCandidate:T Z YaoFull Text:PDF
GTID:1228330371456283Subject:Communication and Information System
Abstract/Summary:PDF Full Text Request
Vision based scene understanding can help and improve the ability of analyzing and cognizing complex indoor and outdoor scenes. It is one of the research hotspots in computer vision. Scene understanding can be divided into local and global understanding based on different semantic levels. The former focuses on analyzing and describing the distributions and categories of the local regions in the image, such as multi-class object recognition and localization. The latter is concentrated on understanding the global characteristics of the scene, such as scene classification. Both local and global scene understanding can strengthen the understanding of the unknown environments for the computer in different cognition levels. They have wide prospect in many application domains such as intelligent surveillance, image and video retrieval and mobile robot navigation and also have important research values and significance. Because visual information is susceptible to influences such as view, scale, background interference and occlusion, the existing vision based scene understanding techniques still remain many problems in accuracy and robustness. This paper make researches on three hotspots in both local and global scene understanding domains and also combining active learning in the machine learning step to reduce the number of manual labeled samples and alleviate the burden of classifier training. The content contribution of our paper is as follows:This paper focuses on a new SVM based multiple view active learning algorithm. Active learning has the ability of reducing the complexity of samples, so it obtains more and more attentions and developments in scene understanding domains such as object recognition and scene classification. SVM has good performance in classification in small number of samples and multiple view learning can reduce the classification error compared with single view learning. In this paper, our algorithm makes two improvements based on traditional multi-view active learning algorithm Co-Testing in both hypothesis generation and sampling strategy. Firstly, we use SVM as base classifier in each view and apply sequential Adaboost to the multiple view active learning framework which can improve the robustness of hypothesis. Secondly, we present an adaptive hierarchical sampling competition strategy. In this mechanism, when the number if contention unlabeled samples is large, unsupervised spectral clustering is used to obtain the coase distribution of these samples in the feature space and then classification uncertainty and redundancy measures are both incorporated to obtain effective multi-view sampling by solving quadratic programming. The experiments prove that our proposed active learning algorithm has better convergence and classification performance compared with several state-of-the-art algorithms.How to jointly recognize and segment different class of objects from a static image is one of the important research domains in local scene understanding and until now still much work has to be done. Due to lack of motion information, the performance of joint object recognition and segmentation is usually susceptible to the influences such as view, lighting and occlusion. This paper proposed a new joint multi-class object recognition and segmentation algorithm. Firstly, we use three different unsupervised image segmentation methods to generate multiple segmentation description of the objects in order to strengthen the spatial relationship between pixels which belong to objects. Then multi-view active learning classifier is built to classify these segments and object information from different segmentation maps are weighted based on segment homogeneity estimation to reduce the uncertainty of object classification in segment level. Finally, we incorporate both local object information from segment level and global object information from image level into hierarchical conditional random field and inference the graph model to simultaneously realize spatial object segmentation smoothing and semantic context optimization.This paper presented a new unstructured road segmentation algorithm. Obtaining accurate road regions from the image is quite valuable in many practical domains such as autonomous driving and mobile robot navigation. Separating the road regions from the non-road regions in complex unstructured environments is more challenging than in traditional structured environments. Therefore, this paper proposed a top-down road segmentation method which is based on both region and boundary information from the image. Pixel level classification based method use self-supervised online learning to improve the adaption to the variation of the road scene while markov random field is applied to spatially smoothing the pixel based classification results. Boundary constraint based method can obtain the robust road boundary description by efficiently estimating the position of the road vanishing point. We use multi-view active learning based global road classifiers to predict the road model and then adaptively select the optimal road segmentation method based on it. In addition, our algorithm incorporates the temporal smoothing mechanism to both improve the results of pixel level classification and road model prediction which can enhance boost the robustness of the segmentation performance.Scene classification helps the computer accurately understand different types of complex environments, so it can play an important role in many domains such as context based image retrieval and autonomous driving. In this paper, we presented a new scene classification algorithm. In the multi-view active learning framework, we use three different views to create hypothesis. Firstly, we apply multiple object detectors to obtain object response features of the scene. Then spatial relationship sets are mined which can better strengthen the spatial relevance between different local regions of the image. Finally, we also proposed a cascaded online Latent Dirichlet Allocation topic model framework. Stochastic gradient descent based online LDA model can better adapt the large scale learning and the spatial pyramid based joint weighted topic modeling can obtain more accurate topic descriptions. Experimental results demonstrate that those three features can effectively depict the high level image semantics in different views compared with traditional low-level image features and converge faster than traditional single view active learning algorithm based on stochastic view combination which also obtains better performance in scene classification accuray.
Keywords/Search Tags:Scene Understanding, Multi-view Active Learning, Joint Object Recognition and Segmentation, Road Segmentation, Scene Classification
PDF Full Text Request
Related items