Image-text Cross-modal Retrieval Based On Deep Hashing Method

Posted on:2019-04-25

Degree:Master

Type:Thesis

Country:China

Candidate:W N Yao

Full Text:PDF

GTID:2428330545472236

Subject:Software engineering

Abstract/Summary:


With the development of the mobile Internet and the spread of smartphones, digital cameras, and other devices, the amount of multimedia data is growing explosively. In information retrieval, the continuous growth of multimedia big data has created a need for cross-modal retrieval. Cross-modal retrieval handles cases in which the query and the data to be retrieved belong to different modalities, such as using an image query to search for text or video results. However, most current mainstream search engines, such as Baidu, Google, and Bing, only return results based on a single modality. In addition, as deep learning has achieved a series of breakthroughs in computer vision and natural language processing, combining multimedia big data with deep learning has become a common development trend in both fields. Exploring novel cross-modal retrieval models that take these new requirements and technologies into account has therefore become one of the challenges in information retrieval.

This paper focuses on image-text cross-modal search. An in-depth analysis and comparison of existing methods shows that hashing offers high storage efficiency and fast retrieval speed for large-scale cross-modal retrieval. However, most current cross-modal hashing methods still rely on traditional hand-crafted features and cannot make full use of the semantic information in the labels when processing multi-label data, which leads to unsatisfactory retrieval results. To address these shortcomings, this paper proposes a deep multi-level semantic hashing (DMSH) method for image-text cross-modal retrieval. It addresses a problem shared by most current cross-modal hashing methods: the rich semantic information embedded in the class labels of multi-label data is not fully exploited during hash code learning. The proposed DMSH combines the strength of deep learning in feature extraction and representation with the efficiency of hashing in data storage and computation.

Specifically, the main work of this paper includes: (1) A comprehensive survey of cross-modal retrieval, deep learning, and hashing methods is conducted, and the research status and existing problems in these fields are analyzed in depth. (2) A similarity matrix based on label co-occurrence is proposed, which addresses the low search accuracy caused by existing methods failing to make full use of the semantic information in labels. (3) Based on an analysis of the network structures used by existing deep hashing methods, a unified framework that integrates deep feature extraction and hash code learning is proposed. Two different deep neural networks extract the semantic features of images and texts, respectively, and at the output end the two subnetworks are linked through label semantic relations to achieve end-to-end learning. (4) The performance of the proposed DMSH is compared with that of several state-of-the-art cross-modal hashing methods, including CCA, CMFH, STMH, SCM, SePH, and DCMH, on the MIRFlickr-25K dataset. (5) The effects of three different convolutional neural networks, CNN-F, VGG-16, and ResNet-50, on the retrieval results are compared experimentally.

Experiments show that the proposed DMSH outperforms the compared methods on the image-text cross-modal retrieval task, and that the retrieval results obtained with the CNN-F network are better than those obtained with VGG-16 and ResNet-50. Building on this, further improvements can be made by exploring ways to better integrate label semantic information, discovering more semantic information, refining the text feature learning module, and improving the network structure to learn better feature representations.
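
Two ideas in the abstract can be illustrated with a small sketch: a graded (multi-level) similarity derived from multi-label annotations, and retrieval by Hamming distance over hash codes. The abstract does not give the definition of the DMSH similarity matrix, so the sketch assumes cosine similarity between binary label vectors as a stand-in; the function names, the toy labels, and the 8-bit codes are all hypothetical and are not taken from the thesis.

```python
from math import sqrt


def label_similarity(a, b):
    """Graded similarity between two binary label vectors.

    Assumption: cosine similarity over shared labels, used here only as a
    stand-in for the label co-occurrence similarity named in the abstract.
    """
    shared = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(a)) * sqrt(sum(b))  # valid because entries are 0 or 1
    return shared / norm if norm else 0.0


def hamming(u, v):
    """Number of positions in which two equal-length hash codes differ."""
    return sum(x != y for x, y in zip(u, v))


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from nearest to farthest code."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Toy multi-label annotations over four hypothetical classes.
image_labels = [1, 1, 0, 1]
text_labels_a = [1, 1, 0, 0]  # shares two labels with the image
text_labels_b = [0, 0, 1, 0]  # shares none

sim_a = label_similarity(image_labels, text_labels_a)
sim_b = label_similarity(image_labels, text_labels_b)

# Hypothetical 8-bit codes with entries in {-1, +1}.
image_code = [1, -1, 1, 1, -1, -1, 1, -1]
text_codes = [
    [1, -1, 1, 1, -1, 1, 1, -1],
    [-1, 1, -1, 1, 1, 1, -1, -1],
]

ranking = rank_by_hamming(image_code, text_codes)
print(sim_a, sim_b, ranking)
```

A binary similarity would treat any pair sharing at least one label as equally similar; a graded value lets pairs that share more labels be pulled closer in Hamming space, which is the general motivation the abstract gives for using label semantics more fully.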

Keywords/Search Tags:

Cross-modal retrieval, Deep learning, Hashing method, Multi-label learning


Related items

1	Image-text Cross-modal Retrieval Based On Deep Hashing Method
2	Deep Label-based Hashing For Cross-modal Retrieval
3	Supervised Hierarchical Cross-modal Hashing
4	Cross-modal Retrieval And Annotation Based On Hashing Learning Method
5	Research On Key Technologies Of Deep Cross-Modal Hashing
6	Research On Cross-modal Retrieval Of Images And Texts Based On Deep Hashing Learning
7	Research On Single-modal And Cross-modal Retrieval By Hashing Technology
8	Technology Research And System Realization On Cross-Media Data Retrieval Based On Hashing Learning
9	Research On Multi-modal Multi-label Hashing Methods For Large Scale Data Search
10	Semantic Transfer Hashing Based On Deep Learning For Cross-modal Retrieval