Deep Structure Based Image Content Analysis And Its Application

Posted on: 2013-06-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Y Yang
Full Text: PDF
GTID: 1228330377951724
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The long-held dream of enabling computers to see as humans do, together with the urgent need for automatic image management and retrieval created by the explosive growth of image data, has made image content analysis a hot research topic in recent years. Image features play a fundamental role in image content analysis: each feature summarizes the appearance properties of some image structures. For example, a color histogram summarizes the colors of pixels, and the bag-of-visual-words model summarizes the texture properties of image patches. However, the total number of image structures is huge, so it is crucial to select a subset of them from which to construct the image feature.

Existing image features are mainly based on low-level structures (pixels, corners, blobs, etc.). Low-level structures have limited variance and can be easily detected by manually designed rules, but they reveal little semantic information, which is the cause of the semantic gap between low-level features and high-level semantic concepts. In contrast, deep structures carry more semantic information (e.g., face, wheel, leg) and are suitable for analyzing image content at the semantic level. Deep structures have large variance and cannot be detected by manually designed rules, so their detection models are usually learned from finely labeled training data. However, labeling the training data is laborious and time-consuming. In this work, we propose several algorithms that learn models of deep structures from weakly supervised training data and analyze image content based on deep structures. The main contributions are summarized as follows:

1. Based on the fact that some categories share similar deep structures, we propose to assemble a new object detector from auxiliary object detectors under the guidance of a few positive examples.
2. We propose a Lazy Diverse Density algorithm to extract deep structures from social images with user-provided tags. Based on the deep structures corresponding to each tag, we expand each tag with six visual properties and generate a detailed description of the image content.
3. We propose a Semantic Point Detector to detect semantically representative image patches. In essence, it is a binary image patch classifier based on semantic representativeness. To avoid the labeling cost, we derive the image patch classifier from an image classifier trained on image-level labeled data.
4. We propose a multi-layer learning algorithm to obtain deep structure models from weakly labeled data. In the first layer, we learn an image classifier based on an improved version of the bag-of-visual-words image representation. The classifier is then used to label the positive images in detail and generate foreground regions as the training data for the second layer. In the second layer, a dense-matching-based region similarity and unsupervised clustering are used to define the parts and their positive examples. After that, an initial model for each part is learned and used to initialize a latent SVM, which further refines it.

The study of deep image structures is closely related to computer vision, machine learning, information retrieval, artificial intelligence, and cognitive science. We hope the ideas and results in this work can provide some insights for other disciplines.
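
The abstract describes an image feature as a summary of the appearance of some image structures, with the color histogram and the bag-of-visual-words model as examples. A minimal sketch of both summaries follows, in their generic textbook form. It is not the improved bag-of-visual-words representation used in the dissertation, and the function names, the quantization into two bins per channel, the toy pixel values, and the two-entry codebook are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def color_histogram(pixels, bins_per_channel=2):
    """Summarize pixel colors as a normalized histogram over quantized RGB bins.

    `pixels` is a non-empty list of (r, g, b) tuples with values in 0..255.
    """
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    total = len(pixels)
    return [count / total for count in hist]


def bag_of_visual_words(descriptors, codebook):
    """Summarize patch descriptors by counting their nearest codebook entries.

    `codebook` is a list of "visual words" (hypothetical here; in practice it
    would be learned, for example by clustering descriptors from training images).
    """
    counts = Counter(
        min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        for d in descriptors
    )
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy data, invented for illustration only.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (10, 20, 30)]
hist = color_histogram(pixels)

codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.2), (0.9, 0.8), (1.2, 1.1)]
bovw = bag_of_visual_words(descriptors, codebook)
```

Both functions discard where in the image a structure occurs and keep only how often each appearance occurs, which is the sense in which a feature is a summary. The choice of which structures to count (pixels here, patches in the second function) is the selection step the abstract calls crucial.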
Keywords/Search Tags: Deep structure, semantic region, semantic point detector, PartBook