
Scene Parsing Research

Posted on: 2017-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: C H Ma
Full Text: PDF
GTID: 2348330488482498
Subject: Software engineering
Abstract/Summary:
In computer vision, scene parsing is an important and difficult problem in image understanding and pattern recognition. Its task is to describe the original image at a higher level of information and to express the object information of the whole image efficiently. It combines traditional detection, segmentation, and multi-label recognition to assign semantic labels to a scene image. Because images vary in illumination, in occlusion by materials, and in the number of object categories, scene parsing is difficult. Two questions are therefore crucial: how to generate good internal representations of the visual information, and how to use contextual information effectively so that the labelling remains consistent.

In recent years, deep learning and image segmentation techniques have achieved remarkable results in computer vision and have received wide attention in scene parsing. Based on these two kinds of techniques, this thesis presents the following work:

1. A multi-scale deep network based on deep learning is presented, trained as a supervised model. Unlike traditional multi-scale methods, it aims to produce good internal representations of the visual information and to use contextual information efficiently by selecting two image patches of different sizes centered on each pixel. The model contains two deep convolutional networks: first, a low-level feature network extracts large-scale global image features; then a local refinement network captures the local characteristics of the small-scale image patch. Combining the features from the two networks yields a dense and complete set of image features, producing a powerful representation that captures texture, color, and contextual information.
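The two-scale idea above can be sketched in a few lines. This is an illustrative toy, not the thesis's actual convolutional networks: for each pixel it crops a large "global" patch and a small "local" patch centered on that pixel, computes a placeholder feature for each (here just per-channel means, standing in for learned CNN features), and concatenates the two so the final descriptor mixes context with local detail. The patch sizes are assumptions for the example.

```python
def crop_patch(image, row, col, size):
    """Crop a size x size patch centered at (row, col), clamping at borders.
    `image` is a list of rows; each pixel is an (r, g, b) tuple."""
    half = size // 2
    h, w = len(image), len(image[0])
    patch = []
    for dr in range(-half, half + 1):
        r = min(max(row + dr, 0), h - 1)
        prow = []
        for dc in range(-half, half + 1):
            c = min(max(col + dc, 0), w - 1)
            prow.append(image[r][c])
        patch.append(prow)
    return patch

def mean_color(patch):
    """Per-channel mean over a patch: a stand-in for a learned feature."""
    pixels = [px for row in patch for px in row]
    n = len(pixels)
    return tuple(sum(px[ch] for px in pixels) / n for ch in range(3))

def two_scale_descriptor(image, row, col, global_size=7, local_size=3):
    """Concatenate coarse (global) and fine (local) features for one pixel."""
    return (mean_color(crop_patch(image, row, col, global_size))
            + mean_color(crop_patch(image, row, col, local_size)))
```

In the thesis the two branches are deep convolutional networks rather than mean filters, but the combination step is the same: the per-pixel descriptor is the concatenation of the large-scale and small-scale features.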
To predict scene annotations efficiently, a fast image-patch annotation method is presented: the image is divided into a uniform 3 x 4 grid of patches, the class probabilities of each patch are predicted, and the class with the maximum probability is selected as the patch label. Contrary to most standard approaches, the multi-scale deep network relies neither on a segmentation technique nor on any task-specific features, and it achieves good performance on the Stanford Background Dataset.

2. A scene parsing method based on multi-scale image segmentation and the efficient match kernel is presented, consisting of feature extraction, image segmentation, and a support vector machine. In feature extraction, kernel descriptors are used to extract effective low-level image features, capturing three characteristics of the image: gradient, color, and shape. Because using the image features directly leads to high dimensionality, the efficient match kernel method is combined with them to measure the similarity between local features more accurately and to reduce the computational cost: the feature descriptors extracted by the kernel descriptors are transferred and merged into each super-pixel, effectively solving the high-dimensionality problem. In the image segmentation part, a multi-scale image segmentation algorithm with resampling and alignment effectively reduces local information redundancy and the complexity of image processing, and its output is combined directly with a linear support vector machine to annotate the scene. On the Stanford Background Dataset, the method annotates the significant objects in scene images effectively.
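The fast patch-labelling step can be sketched as follows. The probabilities would come from the trained network; here `patch_probs` and the class names are hypothetical placeholder data, and only the argmax selection rule from the text is implemented.

```python
def label_patches(patch_probs, class_names):
    """Assign each patch the class with the highest predicted probability.

    patch_probs: one list of per-class probabilities per patch
                 (12 lists for the 3 x 4 grid described above)
    class_names: label for each class index
    """
    labels = []
    for probs in patch_probs:
        best = max(range(len(probs)), key=probs.__getitem__)  # argmax
        labels.append(class_names[best])
    return labels

# Toy usage with two patches and three made-up classes:
labels = label_patches([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
                       ["sky", "tree", "road"])
```

Because each patch is labelled independently by a single argmax, the cost is linear in the number of patches, which is what makes the method fast.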
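The merging of per-pixel descriptors into per-super-pixel features can be illustrated with simple mean pooling. This is a simplified stand-in for the efficient-match-kernel aggregation, not the thesis's actual formulation: each pixel's low-level descriptor is averaged over the super-pixel it belongs to, so every super-pixel receives one fixed-length feature vector regardless of its size, which is what keeps the dimensionality low before the linear SVM. The descriptor values and super-pixel assignments below are made up.

```python
def pool_superpixels(descriptors, assignment):
    """Average the descriptors of all pixels assigned to each super-pixel.

    descriptors: list of equal-length feature vectors, one per pixel
    assignment:  super-pixel id for each pixel (parallel to descriptors)
    Returns {superpixel_id: pooled_feature_vector}.
    """
    sums, counts = {}, {}
    for desc, sp in zip(descriptors, assignment):
        if sp not in sums:
            sums[sp] = [0.0] * len(desc)
            counts[sp] = 0
        for i, v in enumerate(desc):
            sums[sp][i] += v
        counts[sp] += 1
    return {sp: [v / counts[sp] for v in vec] for sp, vec in sums.items()}
```

Each pooled vector would then be fed to the linear support vector machine to predict that super-pixel's class.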
Keywords/Search Tags: Scene parsing, Deep learning, Supervised learning model, Super-pixels