
Application Research Of Convolutional Neural Network In Cross-media Retrieval

Posted on: 2020-06-02    Degree: Master    Type: Thesis
Country: China    Candidate: B B Duan    Full Text: PDF
GTID: 2428330578452884    Subject: Computer application technology
Abstract/Summary:
With the rise and development of network information technology, enormous amounts of data are generated in people's daily lives, and these data exist in various modalities. Faced with such diverse and massive data, finding the relationships among them is an urgent problem and a major challenge in cross-media retrieval research. Within this field, this thesis focuses on mutual retrieval between text and images and designs a cross-media retrieval method based on the deep convolutional neural network VGGNet. The main work of this thesis is as follows:

(1) Briefly summarize the basic characteristics of cross-media data, review common cross-media retrieval methods, and analyze the characteristics of the existing retrieval approaches.

(2) Integrate a deep convolutional neural network into cross-media retrieval. A VGGNet pre-trained on the ImageNet dataset serves as the image feature extractor to obtain deep visual features for images in the target dataset. At the same time, a traditional LDA model is used to obtain latent topic probability distributions as text features for the target dataset. This yields two heterogeneous low-level feature spaces, which a multi-class logistic regression model then maps into an isomorphic high-level semantic space. In this space, a central correlation metric computes the similarity between images and texts, retrieval is performed by ranking according to this similarity, and the results are evaluated with mAP. The experimental results show that deep visual features represent image content better than traditional visual features and improve retrieval performance more effectively. A minimal sketch of this pipeline is given below.

(3) Building on the above method, to better characterize the image content of the target dataset, fine-tuning of VGGNet is proposed, together with an improved regularization method tailored to the characteristics of deep visual features. This regularization works in two ways. First, when fine-tuning the convolutional neural network on the target dataset, regularization is applied to reduce over-fitting. Second, because text features have strong semantic discriminative power while the distribution of image visual features is disordered, the correspondence between image visual features and text features in the high-level semantic space is exploited: the semantic features of images are regularized toward the text semantic features in the high-level semantic space, which effectively improves the semantic representation ability of the image visual features. Experiments show that the improved method further increases retrieval performance.
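The following is a minimal sketch of the retrieval pipeline in (2), not the author's implementation. It assumes PyTorch/torchvision for the pre-trained VGGNet (VGG-19 is chosen here; the abstract only says "VGGNet"), scikit-learn for the LDA topic model and the multi-class logistic regression mapping, and cosine similarity standing in for the thesis's central correlation metric; dataset loading and the mAP evaluation loop are omitted.

    import numpy as np
    import torch
    import torchvision.models as models
    import torchvision.transforms as transforms
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics.pairwise import cosine_similarity

    # Deep visual features: pre-trained VGGNet as a fixed feature extractor.
    vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
    vgg.classifier = vgg.classifier[:-1]   # drop the 1000-way layer, keep the 4096-d fc7 output
    vgg.eval()

    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    def image_features(pil_images):
        """Extract 4096-d deep visual features for a list of PIL images."""
        batch = torch.stack([preprocess(im) for im in pil_images])
        with torch.no_grad():
            return vgg(batch).numpy()

    # Text features: latent topic probability distributions from LDA.
    def text_features(documents, n_topics=10):
        counts = CountVectorizer(stop_words="english").fit_transform(documents)
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        return lda.fit_transform(counts)   # each row is a topic distribution

    # Map each modality into the common high-level semantic space.
    def semantic_projection(features, labels):
        """Fit multi-class logistic regression and return posterior class
        probabilities, which serve as the shared semantic representation."""
        clf = LogisticRegression(max_iter=1000).fit(features, labels)
        return clf, clf.predict_proba(features)

    # Retrieval: rank one modality against the other by similarity.
    def rank(query_semantic, gallery_semantic):
        """Return gallery indices sorted by descending similarity to each query."""
        sims = cosine_similarity(query_semantic, gallery_semantic)
        return np.argsort(-sims, axis=1)

Posterior class probabilities from the logistic regression act as the shared semantic space because both modalities are projected onto the same set of semantic categories, making image and text representations directly comparable.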
Keywords: Cross-media retrieval, Convolutional neural network, Deep visual features, Regularization