
Research On Cross-Media Retrieval Method Based On Convolutional Neural Network And Correlation

Posted on: 2021-02-11 | Degree: Master | Type: Thesis
Country: China | Candidate: S Zhao | Full Text: PDF
GTID: 2428330605956124 | Subject: Engineering
Abstract/Summary:
In the post-Internet era, data has become both complex and abundant. To meet the demand for diverse information, multimedia technology has become a focus of research. Given massive data in different modalities, improving the accuracy of cross-media retrieval helps people understand things more comprehensively and deeply. Within cross-media retrieval, this thesis focuses on mutual retrieval between images and texts.

Traditional convolutional neural network models require considerable time and computation to extract image features, and they extract features poorly when the data set is small. To address this, the Inception V3 model is improved through transfer learning. All weights are initialized from the Inception V3 model pre-trained on ImageNet, the topmost fully connected layer is removed, and the remaining layers are treated as a fixed feature extractor for the image data set used in this thesis. The extracted features are used to train a 10-class classifier with the softmax function, and this classifier is then used directly as the image feature extractor for cross-media retrieval. The Inception V3 model trained by transfer learning in this way is named Inception V3_TL.

Images and texts are heterogeneous in their underlying feature spaces, and the combination of Canonical Correlation Analysis and multinomial logistic regression cannot adequately mine the complex correlations in cross-media data. Building on Inception V3_TL, the correlation method is improved and a PLM method is proposed that combines the partial least squares algorithm with multi-class logistic regression. It minimizes the loss of semantic information and improves the accuracy of cross-media retrieval. Experimental results on the Wikipedia dataset show that the proposed Inception V3_TL + PLM method mines complex correlations in cross-media data more effectively and achieves considerably higher accuracy than the other methods compared.
Keywords/Search Tags:Cross-media retrieval, Convolutional Neural Network, Correlation, Inception V3, Partial least squares algorithm