
Research On Image-text Retrieval Method Based On Deep Learning

Posted on: 2021-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: H Li
Full Text: PDF
GTID: 2518306548494084
Subject: Control Science and Engineering
Abstract/Summary:
In this era of rapid development of computers and communications, people are exposed to more and more multimedia information such as text, video, audio, and images. Through the Internet, people are gradually achieving global sharing of multimedia information, user queries over multimedia data have become increasingly common, and a variety of new application requirements have followed. Cross-media retrieval refers to a multimedia retrieval method that can flexibly move between modalities, that is, retrieve samples of other modalities related to a given instance of one modality. Such search results are rich in content and present query objects to users from more perspectives.

This paper focuses on cross-modal retrieval between the image and text modalities. A deep learning model extracts feature representations of the images and texts in the dataset, maps both into a high-dimensional common subspace, and measures the similarity between samples of the two modalities by their distance in that subspace to complete retrieval. This paper proposes a multi-level feature extraction method and a dual semantic space construction method: the feature extraction stage extracts image and text features that are conducive to fusion, while the feature fusion stage constructs a real semantic space and a transformed semantic space for each modality and combines them for retrieval, effectively improving retrieval performance. The main work and research results of this paper include the following aspects:

(1) Addressing the problem of semantic alignment in image-text retrieval, this paper improves the feature extraction part of existing retrieval models and proposes a multi-level key semantic information extraction method. The retrieval method consists of three modules: the first module adds dilated convolution to the VGG network and to Text-CNN to obtain multi-level features of images and text; the second module achieves semantic alignment by selecting and combining features through an attention mechanism and an outer product; the third module fuses the two modalities and maps them into a common subspace for retrieval.

(2) This paper proposes a dual semantic space retrieval model. In current feature fusion networks, the objective function combines a classification task and a fusion task. Because the feature space of each modality must be classifiable while also accommodating the feature distribution of the other modality, the finally learned feature space loses accuracy and fails to fit either distribution well, which degrades cross-modal search results. This paper first builds a real semantic space, that is, a complete semantic space that identifies single-modality labels well. It then constructs a transformed semantic space, which acts as a bridge between the two modalities' real semantic spaces, carrying its own modality's semantics together with the feature distribution of the modality to be retrieved. For retrieval, each modality's transformed-space feature is compared with the other modality's real-space feature, the similarities are computed, and the results are combined to complete the retrieval.
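The common-subspace retrieval idea above can be sketched as follows. The feature dimensions, the random projection matrices `W_img` and `W_txt`, and the `l2_normalize` helper are illustrative assumptions standing in for the thesis's learned networks; the sketch only shows how ranking by distance in a shared subspace works.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-extracted features; in the thesis these would come from
# the VGG-based image branch and the Text-CNN text branch.
image_feats = rng.normal(size=(5, 512))   # 5 images, 512-d features
text_feats = rng.normal(size=(5, 300))    # 5 captions, 300-d features

# Linear projections into a shared 128-d subspace (random for illustration;
# in practice these would be learned jointly with the fusion objective).
W_img = rng.normal(size=(512, 128))
W_txt = rng.normal(size=(300, 128))

def l2_normalize(x):
    """Normalize rows so that a dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

img_emb = l2_normalize(image_feats @ W_img)
txt_emb = l2_normalize(text_feats @ W_txt)

# Text-to-image retrieval: rank all images by similarity to each query text.
similarity = txt_emb @ img_emb.T            # (5 texts, 5 images)
ranking = np.argsort(-similarity, axis=1)   # best-matching image first
```

With learned projections, matched image-text pairs would land close together in the subspace, so the correct image would appear early in each query's ranking.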
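The attention-and-outer-product alignment in module two can be illustrated with a minimal sketch. The region and word counts, the shared 64-d projection, and the softmax pooling are assumptions for illustration, not the thesis's exact architecture; the sketch shows one common way pairwise affinities (an outer-product-style interaction) feed an attention step that aligns text context to each image region.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical features: 7 image regions and 9 words, both already
# projected into a shared 64-d space by earlier layers.
regions = rng.normal(size=(7, 64))
words = rng.normal(size=(9, 64))

# Pairwise region-word affinities (outer-product-style interaction matrix).
affinity = regions @ words.T                  # (7, 9)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Attention over words for each region: each region attends to the words
# most related to it, which performs the feature selection and combination.
attn = softmax(affinity, axis=1)              # rows sum to 1
attended_text = attn @ words                  # (7, 64) text context per region

# Fused representation: each region concatenated with its aligned text context.
fused = np.concatenate([regions, attended_text], axis=1)  # (7, 128)
```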
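The dual semantic space scoring rule described in (2) can be sketched as follows. The random features, the 64-d size, the `cosine` helper, and the simple averaging of the two directions are illustrative assumptions; the thesis's actual spaces are learned, and its method of synthesizing the two similarity results may differ.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 64

# Hypothetical learned features for one query image and 4 candidate texts.
# Each modality carries a "real" feature (its own semantic space) and a
# "transformed" feature (mapped toward the other modality's space).
img_real = rng.normal(size=dim)
img_trans = rng.normal(size=dim)
txt_real = rng.normal(size=(4, dim))
txt_trans = rng.normal(size=(4, dim))

def cosine(M, v):
    """Cosine similarity between each row of M and the vector v."""
    return (M @ v) / (np.linalg.norm(M, axis=1) * np.linalg.norm(v))

# Compare each modality's transformed-space feature with the other
# modality's real-space feature, then combine the two directions.
score_i2t = cosine(txt_real, img_trans)   # image's transformed vs texts' real
score_t2i = cosine(txt_trans, img_real)   # texts' transformed vs image's real
final = (score_i2t + score_t2i) / 2       # synthesized retrieval score
best = int(np.argmax(final))              # index of the best-matching text
```

Scoring both directions lets each modality's classifier stay faithful to its own label distribution while the transformed space carries the cross-modal comparison, which is the motivation the abstract gives for the dual-space design.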
Keywords/Search Tags:Deep learning, Convolutional neural network, Cross-media retrieval, Feature extraction, Feature fusion