Research On Image Retrieval Methods Based On Visual Attention Model

Posted on: 2019-01-03  Degree: Doctor  Type: Dissertation
Country: China  Candidate: L X Yu  Full Text: PDF
GTID: 1368330545969094  Subject: Computer application technology
Abstract/Summary:
With the development of network and multimedia technology, smart-device applications have become increasingly popular and generate large amounts of multimedia data, of which image and video data account for a large proportion. The continual growth in data scale poses many challenges for existing image processing methods, and how to manage and use these data effectively has attracted wide attention in both academia and industry. Content-based image retrieval queries images by extracting the visual features contained in the images themselves. It is well suited to large-scale image data management and retrieval and is an important technology in current information retrieval. Existing content-based image retrieval methods often use low-level features such as color, texture and shape to calculate the similarity between images. Such features cannot accurately capture the high-level semantics of an image, and these methods rarely consider the attention mechanism of the human eye when observing an image.

The human visual attention system can quickly identify potential target objects in complex scenes and focus its processing on them as regions of interest. This mechanism obtains as much effective information as possible with limited resources, and it can be applied in content-based image retrieval algorithms to improve the performance of an image retrieval system.

A single-view feature usually represents only one aspect of the image information and cannot describe an image completely. Multiple view features contain more useful information than a single one, but simply concatenating the features of different views increases the complexity of the algorithm and leads to a higher feature dimension, which is not beneficial to image retrieval. Features extracted from different views reflect different properties of the same image, and they contain compatible and complementary information that can improve the performance of traditional methods. Multi-view learning can construct a new low-dimensional embedding feature from the complementary information of different view features, which makes it an effective feature fusion method.

Within the framework of content-based image retrieval, and simulating the way the human visual attention mechanism observes and understands images, this thesis analyzes and studies local structure descriptor features, image saliency features and multi-view learning fusion features. The research improves the structure of the image feature extraction algorithms and strengthens their ability to describe image features, which improves the performance of the image retrieval system.

(1) The traditional local structure descriptor lacks gradient information, and the process of detecting and matching image textons may generate overlapping features. To solve this problem, this thesis proposes a diagonal structure descriptor based on gradient change. First, according to the visual attention mechanism and the color differences of diagonal pixels in local regions, five kinds of diagonal structure textons are defined. Second, intermediate features of color, texture and shape are extracted using a new detecting and matching framework. Finally, the retrieval result is obtained by similarity comparison. Experimental results show that this method has higher retrieval precision than several other methods.

(2) Due to the complexity and diversity of local pixel changes in images, the traditional local structure descriptor has difficulty representing image features. To solve this problem, this thesis proposes a multi-trend binary code descriptor based on the characteristics of the human visual pre-attention process, which reveals the multiple change trends of pixels in a local region. First, the input image is transformed into a corresponding mapping subgraph using the new local structure descriptor. On this basis, the spatial correlation characteristics are obtained using a co-occurrence matrix. Then, the different intermediate features are transformed by a weighted normalization strategy, and a new global image feature vector is generated that incorporates some spatial information. Compared with several other methods, this method achieves better retrieval results.

(3) Visual saliency plays an important role in image analysis and understanding, but traditional methods have some problems in regional salient feature extraction and pixel saliency computation. Based on the visual attention model and the connected granule concept, this thesis proposes an image feature extraction method based on adaptive fusion of object and background by computing the saliency of local regions. First, a set of new structure textons is defined. Then, the connected granule concept is introduced, and the connectivity and spatial distribution characteristics of the target are described using connected granule attributes. Finally, the image features are generated using an adaptive vector fusion model. This method not only captures the global features of images but also reflects their local details. It has a strong ability to distinguish target features and achieves better results in the retrieval experiments.

(4) How to effectively simulate the perception of local image regions by human vision is an important problem. According to the characteristics of the human visual receptive field and model, this thesis proposes multi-level convolution saliency features based on Weber's law for image retrieval. First, Weber's law is used to generate the differential excitation image. Then, binary transformation and convolution operations are used to construct a multi-level saliency subgraph. Finally, to exploit the spatial correlation information of the image, a pair-wise correlation and hierarchy statistic model is constructed using a co-occurrence matrix. Experimental results are presented to illustrate the efficiency of this method.

(5) A single image feature usually represents only one aspect of the image information and cannot describe an image completely. Based on multi-view learning and spectral embedding methods, this thesis proposes an improved multi-view spectral embedding feature fusion method for image retrieval, which can obtain as much compatible and complementary information as possible from multiple features. First, the low-dimensional embedding of each visual feature is obtained. Then, the best low-dimensional embedding feature is generated by constructing a new iterative optimization strategy. Experimental results show that this method achieves better retrieval performance.
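
The differential excitation, binary transformation and co-occurrence steps named in contribution (4) can be sketched as follows. This is a minimal illustration and not the implementation used in the thesis. It assumes the differential excitation form commonly used with Weber's law in local descriptors (the arctangent of the summed neighbor-minus-center differences divided by the center intensity), a zero threshold for the binary transformation, and a single horizontal offset for the co-occurrence matrix. The convolution and multi-level stages are omitted, and all function names and parameters are hypothetical.

```python
import numpy as np


def differential_excitation(gray, eps=1e-6):
    """Assumed Weber-style excitation: arctan(sum(neighbor - center) / center)."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    # Replicate border pixels so every pixel has eight neighbors.
    padded = np.pad(g, 1, mode="edge")
    total = np.zeros_like(g)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            total += neighbor - g
    # eps avoids division by zero for black pixels (an assumption of this sketch).
    return np.arctan(total / (g + eps))


def binarize(excitation, threshold=0.0):
    """Binary transformation: 1 where the excitation exceeds the threshold."""
    return (excitation > threshold).astype(np.int64)


def cooccurrence(labels, levels, offset=(0, 1)):
    """Count label pairs (a, b) where b lies at the given non-negative offset from a."""
    dy, dx = offset
    h, w = labels.shape
    src = labels[:h - dy, :w - dx]
    dst = labels[dy:, dx:]
    mat = np.zeros((levels, levels), dtype=np.int64)
    # np.add.at accumulates correctly when the same index pair repeats.
    np.add.at(mat, (src.ravel(), dst.ravel()), 1)
    return mat


def retrieval_feature(gray):
    """Flatten the normalized co-occurrence matrix of the binary map into a vector."""
    binary = binarize(differential_excitation(gray))
    mat = cooccurrence(binary, levels=2).astype(np.float64)
    return (mat / max(mat.sum(), 1.0)).ravel()
```

Calling `retrieval_feature` on two grayscale arrays and comparing the resulting vectors with a distance such as L1 gives a toy version of the similarity comparison. A practical system would combine several offsets and levels rather than the single binary map used here.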
Keywords/Search Tags: Image Retrieval, Visual Attention Model, Saliency Feature, Multi-view Spectral Embedding, Local Descriptor
Related items