
Deep Learning Image Retrieval Based On Social Tag And Salient Region

Posted on: 2020-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
GTID: 2428330590482227
Subject: Computer technology
Abstract/Summary:
With the rapid development of deep learning, most mainstream image retrieval methods now use deep neural networks to extract image features, and they have achieved remarkable results. However, these methods train the networks on manually assigned tags and on all of the pixel information in each image, which has the following drawbacks:

(1) Manual tagging requires a great deal of labor and time, and a manually defined tag set cannot describe the fine-grained semantics of an image. It therefore cannot supply high-quality tag data to the deep networks, which degrades the extracted image features.
(2) Training on the whole image incorporates a large amount of background information unrelated to the retrieval target, which greatly increases the computational load and weakens how well the image features represent the target. It also overemphasizes the global semantics of the image, neglects local detail, and cannot effectively describe images that contain multiple entities, so the retrieval results are unsatisfactory.

To address these problems, this thesis proposes Deep Learning Image Retrieval based on Social Tag and Salient Region (STSRDLIR). The main features of the method are as follows:

(1) Filtering non-visual representative tags. The social tags are filtered using the "cohesive" and "dispersive" distance strategies to remove tags that are unrelated to the visual content of the image.
(2) Extracting social tags for salient regions. First, the salient regions of the image are extracted to remove background content unrelated to the retrieval target. Then, the social tags are vectorized twice, so that semantically similar social tags obtain the same vector representation. Finally, a social tag vector is extracted for each salient region, providing high-quality image data and tag data for training the deep networks.
(3) Designing the structure of the deep networks. Input: the similarities and differences between social tag vectors are used to judge whether two salient regions are similar. Triplets of salient regions are constructed so that the first two regions are similar and the third is dissimilar to both, and these triplets are fed to the deep networks for training. Network structure: VGGNet (Visual Geometry Group Net) is used as the base model, and its structure is optimized. Objective function: an objective function based on the salient-region triplets is designed to guide the parameter optimization of the deep networks, so that the generated feature vectors preserve the semantic similarity of the salient regions. Parameter training: the network parameters are trained in combination with transfer learning, to improve the generalization ability of the model and to generate high-level semantic features of salient regions with strong representational power.
(4) Image hash retrieval based on salient regions. The feature vectors of the salient regions extracted by the deep networks are hashed to improve retrieval speed and save storage space. For each image, a hash list built from the hash codes of its salient regions is stored in the database. Similar images are returned by computing the Hamming distance between the hash codes of the query image and the hash codes in the database, and then converting the Hamming-distance ranking into an image ranking.

The experiments in this thesis use the NUS-WIDE dataset. Comparisons with the BRE, MLH, KSH, BRE-CNN and MLH-CNN algorithms indicate that STSRDLIR overcomes the shortcomings of current mainstream retrieval methods described above, extracts the high-level semantic features of images accurately, and achieves better retrieval results than these methods.
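The triplet objective in (3) and the Hamming-distance ranking in (4) can be illustrated with a minimal sketch. The abstract does not give the exact loss, hashing function, or the rule for turning region-level distances into an image ranking, so everything below is an assumption made for illustration: a standard hinge-style triplet loss on squared Euclidean distance, sign thresholding as the hash, and the minimum region-to-region Hamming distance as the image score. All function names are hypothetical.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 1.0) -> float:
    """Hinge-style triplet loss (assumed form, not taken from the thesis).

    The loss is zero once the dissimilar region is at least `margin`
    farther from the anchor than the similar region is.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, margin + gap)


def binarize(features: Sequence[float], threshold: float = 0.0) -> int:
    """Pack one bit per feature into an integer hash code (assumed hash)."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")


def rank_images(query_codes: List[int],
                database: Dict[str, List[int]]) -> List[str]:
    """Rank images by their closest salient-region code to any query code.

    Using the minimum region-to-region distance as the image score is an
    assumption; the abstract only says that the Hamming-distance ranking
    is converted into an image ranking.
    """
    def image_distance(codes: List[int]) -> int:
        return min(hamming(q, c) for q in query_codes for c in codes)

    return sorted(database, key=lambda name: image_distance(database[name]))
```

In this sketch each database entry maps an image name to the list of hash codes of its salient regions, mirroring the per-image hash list described in (4). A real system would learn the features with the optimized VGGNet model rather than take them as given.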
Keywords/Search Tags: Social Tag, Salient Region, Deep Learning, Feature Extraction, Image Retrieval