Deep Learning-based Fine-grained Cross-media Retrieval

Posted on:2022-01-28

Degree:Master

Type:Thesis

Country:China

Candidate:J M Bai

GTID:2518306752496994

Subject:Computer application technology

Abstract/Summary:

With the development of the Internet, the forms of web data are multiplying rapidly and now include images, text, video, and audio, and people's demands for cross-media retrieval have become increasingly flexible. Cross-media retrieval has attracted the attention of many scholars, but existing work focuses mainly on coarse-grained retrieval. Fine-grained cross-media retrieval has received far less study, so current methods cannot meet practical application requirements. As a new research direction, fine-grained cross-media retrieval faces not only the media gap but also the problems characteristic of fine-grained tasks: small differences between subcategories and large differences within a subcategory. To address these two main problems, this thesis studies fine-grained cross-media retrieval in depth. Its main innovations are as follows:

(1) Because most existing cross-media retrieval methods ignore the fine-grained features of the data, we propose a multi-model network for fine-grained cross-media retrieval (MMNT). The method designs a proprietary network for each medium and a common network shared by the four types of media, so that both the medium-specific and the shared features are taken into account. The model learns the correlations between different media while also learning the specific attributes of each medium, which effectively improves the accuracy of cross-media retrieval.

(2) Because most existing fine-grained cross-media retrieval methods ignore the semantic information expressed by the text, we propose a deep supervision and feature fusion method for fine-grained cross-media retrieval (DSFF). The method uses the label information and semantic information of the data to learn, through a deep supervision network, the correlations between the features of different media in both the label space and the semantic space. It minimizes a classification loss, a discriminant loss, and a triplet loss to eliminate the media gap while preserving the differences between samples of different semantic categories. In addition, the method measures similarity using a combination of label features and semantic features, which further improves cross-media retrieval performance.

(3) Because fine-grained features are difficult to extract accurately, we propose an attention mechanism and modal dependence method for fine-grained cross-media retrieval (AMMD). The method introduces an attention mechanism and uses image data as the intermediate medium to explore in depth the latent relationships both within a single medium and between different media. It also proposes a key-frame-based video denoising analysis method, which obtains a clean data set through sample selection and thereby improves the accuracy of cross-media retrieval.
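To make two ideas from contribution (2) concrete, the sketch below shows one common form of a triplet loss and one way to rank gallery items by a similarity that combines label features with semantic features. The abstract does not give the thesis's actual formulas, so everything here is an assumption for illustration: the squared Euclidean distance, the `margin` value, cosine similarity, the weighted sum with `alpha`, and all function names are hypothetical.

```python
import math

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss (assumed form): pull the same-category sample
    closer to the anchor than the different-category sample by `margin`."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def fused_similarity(q_label, g_label, q_sem, g_sem, alpha=0.5):
    """Weighted sum of label-space and semantic-space similarity.
    `alpha` is a hypothetical mixing weight, not a value from the thesis."""
    return alpha * cosine(q_label, g_label) + (1 - alpha) * cosine(q_sem, g_sem)

def rank(query, gallery, alpha=0.5):
    """Return gallery indices sorted from most to least similar to the query.
    `query` and each gallery item are (label_features, semantic_features) pairs."""
    q_label, q_sem = query
    scores = [fused_similarity(q_label, g_label, q_sem, g_sem, alpha)
              for g_label, g_sem in gallery]
    return sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)
```

In a cross-media setting the query and gallery items would come from different media (for example, a text query against an image gallery), with both already mapped into the shared label and semantic spaces by the trained network.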

Keywords/Search Tags:

fine-grained, cross-media retrieval, media gap, feature fusion, attention mechanism

Related items

1	Research On A Fine-grained Cross-media Retrieval Method Based On Adversarial Networks
2	Research On Multi-modal And Multi-Grained Network Based Cross-Media Retrieval
3	Research And Application On Fine-Grained Image Classification Based On Bilinear Model
4	Fine Grained Image Classification Based On Multi-scale Feature Fusion And Attention Mechanism
5	Research On Fine-Grained Image Classification Method Based On Feature Fusion
6	The Fine-Grained Retrieval Of Sketches Based On Deep Learning And Related Research
7	Research And Implementation Of Fine-grained Image Classification Method For Weakly Supervised Attention
8	Fine-grained Image Classification Based On Convolutional Neural Network
9	Research On Fine-grained Image Classification Based On Deep Residual Network
10	Multi-layer Weight-Aware Bilinear Pooling And Attention Mechanism For Fine-Grained Image Classification