
Research On Cross-media Retrieval Methods Based On Deep Learning

Posted on: 2022-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Lu
GTID: 2518306533495214
Subject: Electronic information
Abstract/Summary:
With the rapid development of Internet and multimedia technology, a large amount of multimedia data is being produced. Extracting the genuinely useful information from this data, however, is a difficult problem. Cross-media retrieval technology emerged in response to this trend of information diversification. Compared with traditional single-media retrieval, cross-media retrieval can meet people's needs in the era of multi-source data. However, there is a natural heterogeneity gap between different media, which makes cross-media retrieval very challenging. At the same time, multimedia data is characterized by low-level differences and high-level semantic relevance: different media differ in form but can express the same semantics, and this provides the theoretical basis for cross-media retrieval. This thesis aims to achieve reliable and efficient cross-media retrieval using deep learning methods. The main research work includes:

(1) To address the problem that most methods ignore inter-media association information, a cross-media residual attention network is proposed. The network extracts the internal features of the image and text media and also mines the correlation features between the media. The network also uses an attention mechanism to locate the key parts of the media data, which provides accurate intra-media features for correlation learning. To address the migration problem in the representation mapping process, a cross-media joint loss function based on a classification loss and a semantic loss is designed. The joint loss function uses the Island loss as the semantic loss and the cross-entropy loss as the classification loss. With the combined loss, the network constrains intra-class differences while keeping different classes separable, which improves the accuracy of the representation distribution.

(2) To address the insufficient feature extraction ability of the above method, a two-level network is proposed. The model is built from several different neural networks arranged in two levels: the bottom-level networks extract image and text features, and the top-level network mines the relationships between the media features. This structure improves the feature extraction ability of the model. To further strengthen the network's association learning, a cross-media association loss function is proposed. Borrowing the idea of the attention mechanism, the loss uses the learned feature attention weights to guide the network's projection mapping, which improves the distribution accuracy of the data representations.
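The joint loss in (1) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so the code below assumes the Island loss as it is commonly defined (a center-loss term that pulls features toward their class center, plus a pairwise cosine term that pushes class centers apart) and a simple weighted sum with cross-entropy. The function names, the weights `lam` and `lam1`, and the use of plain Python lists for features in the shared image-text space are all illustrative assumptions, not the author's implementation.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (the classification loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def island_loss(features, labels, centers, lam1=1.0):
    """Island loss as commonly defined (assumed form, not taken from the thesis).

    center term: 0.5 * sum_i ||x_i - c_{y_i}||^2   (restricts intra-class spread)
    pair term:   sum_{j != k} (cos(c_j, c_k) + 1)  (keeps class centers apart)
    """
    center_term = 0.5 * sum(
        sum((x - c) ** 2 for x, c in zip(feat, centers[y]))
        for feat, y in zip(features, labels)
    )
    pair_term = sum(
        cosine(centers[j], centers[k]) + 1.0
        for j in range(len(centers))
        for k in range(len(centers))
        if j != k
    )
    return center_term + lam1 * pair_term


def joint_loss(logits, features, labels, centers, lam=0.01, lam1=1.0):
    """Mean cross-entropy plus lam * Island loss (weighting is an assumption)."""
    cls = sum(cross_entropy(z, y) for z, y in zip(logits, labels)) / len(labels)
    return cls + lam * island_loss(features, labels, centers, lam1)
```

In this sketch the same `centers` would be shared by image and text features after both are projected into the common space, so that samples of one class from either medium are drawn to the same center. In a real network the centers would be learned alongside the other parameters.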
Keywords/Search Tags:cross media retrieval, deep neural network, attention mechanism, joint loss function