Research On Key Technology For Fine-grained Image Classification, Segmentation, Generation And Retrieval

Posted on: 2018-09-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Zhao
Full Text: PDF
GTID: 1318330542455067
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computing and network information technology, people expect more from content services and want results that are specific and customized. For example, when they use an image retrieval service, they expect the search engine to return images that match the desired attributes; when they use an image classification service, they want to know not only the coarse class of an object but also its fine-grained species or model. Fine-grained image analysis is one possible solution to such requirements. Fine-grained images are the main research target of this thesis, and the author thoroughly investigates the problems of fine-grained image segmentation, classification, generation and retrieval. The main contributions are fourfold.

1. The author proposes a soft visual attention model for fine-grained image classification. Existing visual attention models do not consider the diversity of the attention maps and can only find similar attentive regions in the image. The author therefore adds a constraint on the attention canvas and designs a new loss function to encourage attention diversity. Long Short-Term Memory networks are used to integrate the attentive representations at different time steps and classify the fine-grained images.

2. Considering the characteristics of clothing shopping images, the author proposes a co-segmentation algorithm to extract the common clothing regions among the images. Since online clothing images usually show human models and the clothing items should be the visually salient regions, the author uses upper-body detection and co-saliency detection to localize the clothing regions in different images. The Gaussian mixture models of the foreground and background are then estimated separately. The author uses these Gaussian mixture models to co-segment the images iteratively until the optimal result is obtained.

3. Considering the need to view objects from different viewpoints, the author proposes a new image generation algorithm that can generate multi-view images from a single-view input. By integrating variational inference into generative adversarial networks, the algorithm first generates a low-resolution image of the target view from the single-view input image, which models the structure of the object. The high-resolution image is then generated using adversarial training.

4. When an image-based search engine is used on an e-commerce website, the query image may not always match what the user has in mind. The author proposes a new fashion search algorithm that can manipulate the attributes of the query image so that it fully meets the user's requirements. The author designs a memory block that stores all the prototype attribute representations. Specific attribute representations are read and updated within the memory and used to guide the attribute manipulation. Finally, the modified attribute representation is used to find similar images with the desired attributes.
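
The memory-guided attribute manipulation described in the fourth contribution can be illustrated with a toy sketch. This is not the dissertation's implementation: the slot layout, the `MEMORY` and `GALLERY` contents, and the function names `manipulate` and `search` are all hypothetical, the vectors are made-up numbers rather than learned representations, and cosine similarity is an assumed choice of retrieval metric. The sketch only reads from the memory; the update step mentioned in the abstract is omitted.

```python
import math

# Hypothetical prototype memory: one vector per (attribute, value) pair.
# These are toy numbers, not learned embeddings.
MEMORY = {
    ("color", "red"): [1.0, 0.0],
    ("color", "blue"): [0.0, 1.0],
    ("sleeve", "long"): [1.0, 0.0],
    ("sleeve", "short"): [0.0, 1.0],
}

# Assumed layout: an image representation is a concatenation of
# per-attribute slots, given here as (start, end) index ranges.
SLOTS = {"color": (0, 2), "sleeve": (2, 4)}


def manipulate(query, attribute, value):
    """Return a copy of `query` with one attribute slot overwritten
    by the prototype read from the memory."""
    start, end = SLOTS[attribute]
    edited = list(query)
    edited[start:end] = MEMORY[(attribute, value)]
    return edited


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, gallery):
    """Return the name of the gallery item most similar to `query`."""
    return max(gallery, key=lambda name: cosine(query, gallery[name]))


# Hypothetical gallery of already-encoded catalogue images.
GALLERY = {
    "red_long": [0.9, 0.1, 0.8, 0.2],
    "blue_long": [0.1, 0.9, 0.9, 0.1],
    "red_short": [0.8, 0.2, 0.1, 0.9],
}

query = GALLERY["red_long"]                  # the user's query image
edited = manipulate(query, "color", "blue")  # request: same item, but blue
best_match = search(edited, GALLERY)
```

In this toy setting, changing only the color slot moves the best match from the red long-sleeved item to the blue long-sleeved one, while the untouched sleeve slot keeps the short-sleeved item ranked low.
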
Keywords/Search Tags: Deep Learning, Visual Attention, Co-segmentation, Memory Augmentation, Generative Adversarial Networks