Research on Information Retrieval Based on Cross-Modal Association Analysis

Posted on: 2021-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Deng
GTID: 2428330623968136
Subject: Software engineering

Abstract/Summary:
With the rapid growth of images, text, audio, video, and other multi-modal data, the demand for diversified retrieval keeps increasing, and cross-modal retrieval has attracted wide attention. However, data from different modalities are heterogeneous, and measuring the content similarity of heterogeneous data remains challenging. At the same time, in the big-data setting, network data are voluminous and high-dimensional, so achieving accurate and efficient retrieval has become an urgent problem. This thesis applies deep learning and hash learning to the modeling and matching of cross-modal retrieval. The main contributions are as follows:

(1) Deep feature extraction and representation. The heterogeneity of different modal data makes it difficult to measure similarity directly, and humans and computers understand text and images differently at the semantic level, creating a semantic gap. Text features are extracted with an LSTM-CNN model and image features with a deep convolutional neural network, yielding representations that carry richer semantic information (a two-branch encoder sketch follows this abstract).

(2) A cross-modal retrieval model based on deep correlation analysis. Different modalities carry different amounts of information when describing the same semantics, and local fine-grained correlations are hard to extract. A cross-modal collaborative attention network based on deep learning is therefore proposed: a deep neural network builds fine-grained feature representations of the multi-modal data, and an attention mechanism captures the subtle interactions between text and image to model fine-grained inter-modal correlation (see the cross-attention sketch below).

(3) A cross-modal retrieval matching method based on hash learning. To reduce the storage cost and slow retrieval speed on large data sets, a deep supervised discrete hash model is proposed, addressing both the efficiency and the quality of cross-modal matching. Feature learning and hash-code learning are unified in a single framework, so feature representations and hash functions are learned jointly. During training, a similarity matrix serves as supervision to preserve consistency within and between modalities, and a latent semantic matrix of the labels links labels to hash codes; learning hash codes directly from labels safeguards code quality and improves retrieval efficiency. The optimization preserves the discrete constraints, which reduces quantization loss (see the hash-loss sketch below).

(4) A cross-modal retrieval system is designed and implemented. A browser/server architecture provides two cross-modal retrieval functions: retrieving text with an image query and retrieving images with a text query. The proposed deep supervised discrete hash model stores and retrieves the cross-modal data, and a demonstration system visualizes the results (a Hamming-ranking helper is sketched below).
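The encoders in (1) are named only at the architecture level (an LSTM-CNN stack for text, a deep CNN for images). As a minimal PyTorch sketch, with all layer sizes chosen here purely for illustration, such a two-branch encoder might look like:

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class TextEncoder(nn.Module):
        """LSTM over word embeddings, then a 1-D convolution over the hidden
        states (an LSTM-CNN stack), max-pooled over time."""
        def __init__(self, vocab_size, embed_dim=300, hidden_dim=256, feat_dim=512):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
            self.conv = nn.Conv1d(2 * hidden_dim, feat_dim, kernel_size=3, padding=1)

        def forward(self, tokens):                    # tokens: (B, T) word indices
            h, _ = self.lstm(self.embed(tokens))      # (B, T, 2H)
            h = self.conv(h.transpose(1, 2))          # (B, F, T)
            return h.max(dim=2).values                # pooled text feature, (B, F)

    class ImageEncoder(nn.Module):
        """Deep CNN backbone (ResNet-18 here) projected into the shared feature space."""
        def __init__(self, feat_dim=512):
            super().__init__()
            backbone = models.resnet18(weights=None)
            backbone.fc = nn.Identity()               # drop the classification head
            self.backbone = backbone
            self.proj = nn.Linear(512, feat_dim)

        def forward(self, images):                    # images: (B, 3, 224, 224)
            return self.proj(self.backbone(images))   # image feature, (B, F)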
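The collaborative attention model in (2) is described but not specified here; a hedged sketch of symmetric cross-attention between word-level and region-level features, using PyTorch's standard multi-head attention module (dimensions are assumptions), could be:

    import torch
    import torch.nn as nn

    class CoAttention(nn.Module):
        """Symmetric cross-attention: each modality queries the other so that
        fine-grained correlated words/regions are emphasized before pooling."""
        def __init__(self, dim=512, heads=8):
            super().__init__()
            self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, txt_feats, img_feats):
            # txt_feats: (B, T, D) word features; img_feats: (B, R, D) region features
            txt_ctx, _ = self.txt2img(txt_feats, img_feats, img_feats)  # words attend to regions
            img_ctx, _ = self.img2txt(img_feats, txt_feats, txt_feats)  # regions attend to words
            return txt_ctx.mean(dim=1), img_ctx.mean(dim=1)             # pooled (B, D) vectors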
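The deep supervised discrete hash model in (3) couples similarity-matrix supervision with a quantization term under discrete (±1) constraints. A sketch in the spirit of such objectives, where the logistic likelihood and the weighting are assumptions rather than the thesis's exact loss:

    import torch
    import torch.nn.functional as F

    def hash_loss(u_img, u_txt, S, B, alpha=1.0):
        """u_img, u_txt: real-valued network outputs, (N, K); S: 0/1 semantic
        similarity matrix, (N, N); B: discrete codes in {-1, +1}^(N, K).
        Illustrative only; not the thesis's actual objective."""
        theta = u_img @ u_txt.t() / 2                       # pairwise inner products
        sim_loss = (F.softplus(theta) - S * theta).mean()   # negative log-likelihood of S
        quant_loss = F.mse_loss(u_img, B) + F.mse_loss(u_txt, B)  # stay close to the codes
        return sim_loss + alpha * quant_loss

    # The discrete codes themselves are typically updated with sign(), which keeps
    # the +/-1 constraint exact and limits quantization loss:
    # B = torch.sign(u_img.detach() + u_txt.detach())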
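Given binary codes, the two retrieval functions in (4) reduce to Hamming ranking. For ±1 codes the Hamming distance follows directly from the inner product, as in this small hypothetical helper:

    import torch

    def hamming_rank(query_code, db_codes, topk=10):
        """Rank database codes by Hamming distance to the query.
        query_code: (K,) and db_codes: (N, K), both with +/-1 entries;
        for K-bit codes, Hamming distance = (K - <q, d>) / 2."""
        K = query_code.numel()
        dist = (K - db_codes @ query_code) / 2     # (N,) distances
        return torch.topk(-dist, topk).indices     # indices of the nearest items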
Keywords/Search Tags:Cross-modal Retrieval, Deep Correlation Analysis, Feature Extraction, Hash Learning