Research On Cross-media Retrieval Method Based On Compression Convolutional Neural Networks

Posted on: 2021-04-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Y Sun | Full Text: PDF
GTID: 2428330605961326 | Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and multimedia technology, media content in many forms (text, images, audio, video) has grown explosively on the network. Enabling computers to understand multimedia content quickly and accurately, and to capture the links between information in different modalities so that cross-media retrieval becomes possible, is key to helping people find the information they need in massive multimedia collections. In recent years, deep learning has developed rapidly in the field of information retrieval and, because of its powerful capabilities, has been used increasingly in cross-media retrieval research. However, mature deep neural network models have many parameters and a high computational cost, which limits the speed of cross-media retrieval and directly affects the user experience. This thesis briefly analyzes the current state of cross-media research and summarizes existing cross-media retrieval methods. Building on that analysis, it takes bidirectional retrieval between images and text as its entry point. To address the heavy computation of deep convolutional neural network models, it proposes a cross-media method for bidirectional image-text retrieval based on compressed convolutional neural networks, which achieves good results. The main work is as follows:

(1) Two channel-pruned VGG-16 compressed convolutional neural networks, one pre-trained and one fine-tuned on the target dataset, are each used to extract low-level image features from the dataset, and the Latent Dirichlet Allocation (LDA) text topic model is used to extract topic features from the texts. To express images and texts at a higher level of abstraction, the heterogeneous low-level image features and text topic features are fed to multi-class logistic regression models for classification training, which yields an image category probability feature vector and a text category probability feature vector in the same semantic space. The text category probability feature vector is then used to regularize the image category probability feature vector so that the image features become more semantically discriminative. Finally, a de-averaged (mean-removed) cosine similarity measure is used to compute the similarity between the image and text category probability feature vectors, and the mean average precision (MAP) is computed from the resulting similarity matrix to evaluate the experimental results. Experiments on different datasets show that applying a compressed convolutional neural network to bidirectional image-text retrieval improves retrieval speed while maintaining the accuracy of the retrieval results.

(2) Building on the above algorithm, a similarity measure based on the scalar product is used to compute the similarity between the image feature vector and the text feature vector, which further improves the accuracy of bidirectional image-text retrieval. The scalar product of two vectors reflects not only the angle between them but also the projection of one vector onto the direction of the other, so it accounts for differences in both the direction and the magnitude of the two vectors. Experiments on different datasets show that computing the similarity between image and text feature vectors with the scalar product further improves retrieval accuracy.
Keywords/Search Tags: Cross-media retrieval, Compressed convolutional neural network, Scalar Product, Similarity measure
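
The two similarity measures and the MAP evaluation named in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. It assumes that "de-averaging" means subtracting each vector's own mean before taking the cosine, that an item counts as relevant when it shares the query's category label, and that average precision is computed over the full ranking. The function names and the toy probability vectors are hypothetical.

```python
import numpy as np


def centered_cosine_similarity(images, texts):
    """Cosine similarity after removing each vector's own mean (assumed reading
    of "de-averaging"). Rows are category probability vectors."""
    img = images - images.mean(axis=1, keepdims=True)
    txt = texts - texts.mean(axis=1, keepdims=True)
    # Guard against all-constant vectors, whose centered norm is zero.
    img = img / np.maximum(np.linalg.norm(img, axis=1, keepdims=True), 1e-12)
    txt = txt / np.maximum(np.linalg.norm(txt, axis=1, keepdims=True), 1e-12)
    return img @ txt.T


def scalar_product_similarity(images, texts):
    """Plain scalar (dot) product: |a| * |b| * cos(angle), so both the direction
    and the magnitude of the vectors influence the score."""
    return images @ texts.T


def mean_average_precision(sim, query_labels, gallery_labels):
    """MAP over all queries; sim[i, j] scores gallery item j for query i."""
    query_labels = np.asarray(query_labels)
    gallery_labels = np.asarray(gallery_labels)
    average_precisions = []
    for i, label in enumerate(query_labels):
        order = np.argsort(-sim[i], kind="stable")  # best match first
        relevant = gallery_labels[order] == label
        if not relevant.any():
            continue  # no relevant item for this query, skip it
        hits = np.cumsum(relevant)
        ranks = np.arange(1, len(relevant) + 1)
        average_precisions.append((hits[relevant] / ranks[relevant]).mean())
    return float(np.mean(average_precisions)) if average_precisions else 0.0


# Toy category probability vectors (3 items, 3 categories), purely illustrative.
image_probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]])
text_probs = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8]])
labels = np.array([0, 1, 2])

sim_cos = centered_cosine_similarity(image_probs, text_probs)
sim_dot = scalar_product_similarity(image_probs, text_probs)

# Image-to-text retrieval uses the matrix as is; text-to-image uses its transpose.
map_i2t_cos = mean_average_precision(sim_cos, labels, labels)
map_i2t_dot = mean_average_precision(sim_dot, labels, labels)
map_t2i_dot = mean_average_precision(sim_dot.T, labels, labels)
print(map_i2t_cos, map_i2t_dot, map_t2i_dot)
```

Because the same similarity matrix serves both retrieval directions, switching from the centered cosine to the scalar product only changes the function that builds the matrix. The evaluation code stays the same.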