Design And Implementation Of Cross-modal Retrieval For Images And Texts Based On Deep Learning And Hashing Methods

Posted on: 2024-01-29

Degree: Master

Type: Thesis

Country: China

Candidate: Z L Luo

GTID: 2568306944958099

Subject: Computer technology

Abstract/Summary:

With the advent of the Internet era, the explosive growth of image and text data online has created an urgent need to retrieve the information people want from massive data efficiently and accurately. Image-text retrieval currently faces several problems: text-based single-modal retrieval suffers from high information redundancy or information loss; image-based single-modal retrieval consumes substantial resources and is inefficient; and cross-modal retrieval algorithms fail to achieve sufficient "semantic alignment" and are also inefficient. To address these problems, this thesis makes the following contributions:

(1) It proposes single-modal retrieval algorithms based on deep learning and hashing methods, together with a multi-modal data processing scheme. For text data, an algorithm based on BERT and a hash encoder achieves large-scale, efficient semantic text retrieval. For image data, a mean hash algorithm based on grayscale value comparison is combined with ElasticSearch to achieve large-scale, efficient, and accurate image retrieval. In addition, a multi-modal data processing scheme for audio and video data is proposed to support audio and video retrieval.

(2) It proposes a cross-modal retrieval algorithm based on pre-trained models and encoders. A pre-trained model based on a multi-path Transformer allows data from different modalities to interact fully and share information, achieving high-quality "semantic alignment". On top of this pre-trained model, a dual encoder and a fusion encoder are constructed: the dual encoder performs "coarse recall" and the fusion encoder performs "fine ranking", which together yield efficient and accurate cross-modal retrieval.

(3) It designs and implements an image-text cross-modal retrieval system based on deep learning and hashing methods. The system follows a layered design, integrates the single-modal and cross-modal retrieval algorithms above, and uses middleware such as RocketMQ, Redis, and Nginx to achieve high efficiency, high precision, and high availability.

Finally, the system was applied in the National Key R&D Program project "Research and Application Demonstration of the Winter Olympics Global Communication Platform". During the Winter Olympics, the system provided high-quality retrieval services to a large number of users and was highly recognized by the Ministry of Science and Technology and the Winter Olympics Organizing Committee.
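
Item (1) names a mean hash algorithm based on grayscale value comparison, but the abstract gives no implementation details. As a rough illustration only, the sketch below follows the commonly used average-hash idea: each grayscale pixel is compared with the mean gray value to produce one bit, and two hashes are compared by Hamming distance. The function names `average_hash` and `hamming_distance`, the tiny 2x2 inputs, and the use of plain nested lists are all assumptions made for this example; a real pipeline would typically downscale the image to a small fixed size (for example 8x8) and convert it to grayscale first, and the thesis's actual algorithm may differ.

```python
from typing import Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Hypothetical helper: build a bit string from a small grayscale grid.

    Each pixel contributes one bit: 1 if its gray value is at or above
    the mean of all pixels, otherwise 0. Bits are packed row by row.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Toy grayscale grids (assumed data, already downscaled).
img_a = [[10, 200], [220, 30]]
img_b = [[12, 190], [210, 35]]  # img_a with small brightness changes
img_c = [[200, 10], [30, 220]]  # opposite light/dark pattern

hash_a = average_hash(img_a)
hash_b = average_hash(img_b)
hash_c = average_hash(img_c)

print(hamming_distance(hash_a, hash_b))  # near-duplicate: small distance
print(hamming_distance(hash_a, hash_c))  # different pattern: large distance
```

Because the hash is a short fixed-length bit string, it can be stored as an indexed field in a search engine, which is one plausible reason for pairing such a hash with ElasticSearch for large-scale lookup.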

Keywords/Search Tags:

deep learning, hash coding algorithm, image-text cross-modal retrieval

Related items

1	Research On Digital Library Cross Modal Retrieval Based On Deep Hash Learning
2	Design And Implementation Of A Cross-modal Retrieval System Based On Deep Hashing
3	Deep Network For Image-Text Cross-Modal Retrieval
4	Research On Cross-modal Hash Retrieval Algorithm Based On Unsupervised Learnin
5	Research On Cross-modal Retrieval And Recognition Of Visual And Text
6	Cross-modal Retrieval And Annotation Based On Hashing Learning Method
7	Cross-modal Retrieval Using Deep Neural Network
8	Multi-branch Cross-Modal Person Reidentification Algorithm With Fused Attention Hash Coding
9	Research On Deep Hashing Method And Security For Cross-Modal Retrieval
10	Learning To Hash For Large-scale Cross-modal Retrieval