Research on Cross-language Image-text Retrieval Method Based on Chinese-English Parallel Corpus

Posted on: 2022-06-24  Degree: Master  Type: Thesis
Country: China  Candidate: G F Feng  Full Text: PDF
GTID: 2518306533467134  Subject: Control Engineering
Abstract/Summary:
With the rapid development of mobile technologies, hundreds of millions of people capture and collect images anytime and anywhere on mobile devices, and then share them on social networks together with text in different languages. In this mixed image-text form of sharing, language on the Internet is no longer only a carrier of information; combined with images as rich media, it becomes a multimodal means of expression. However, existing cross-modal information retrieval methods mainly focus on digital images and English text, while research on cross-modal retrieval for Chinese and other languages lags behind and lacks effective training data. To address this problem, this thesis proposes a cross-modal information retrieval method based on a Chinese-English parallel corpus, which establishes the relevance between digital images and Chinese text indirectly through English text. The work is outlined as follows:

(1) Cross-language Text Feature Extraction Based on a Pre-trained Language Model: A Chinese-English parallel corpus for image scene description is constructed, using as keywords the results of word-frequency statistics and stop-word filtering on the label text of the MSCOCO dataset. The shortcomings of using one-hot encoding and Word2Vec to vectorize words, and of using WMD to measure text similarity, are analyzed. BERT, the Transformer-based bidirectional encoder representation, is adopted as the pre-trained model. Tokenizers are used to process the Chinese and English data during text preprocessing. A multi-head attention mechanism is used to extract English and Chinese text features separately, after which masked language model (MLM) training is carried out. Finally, semantic mapping of the text features is realized with a convolutional neural network, and the semantic mapping results are analyzed experimentally.

(2) Image Feature Extraction and Aggregation Based on a Self-Attention Mechanism: A quadtree-based overlapping image segmentation algorithm is designed, and a scale-invariant topological ordering of the adaptive quadtree image blocks is obtained with a depth-first traversal algorithm. The global feature of the image is extracted by a pre-trained convolutional neural network, and a local feature extraction method for the overlapping image blocks is designed. A spatial aggregation algorithm based on a self-attention mechanism is proposed for the local features of the image sub-blocks; it takes into account the spatial relationships among the sub-blocks produced by adaptive quadtree segmentation. The feature aggregation results are analyzed experimentally.

(3) Cross-language Image Matching Method Based on a Chinese-English Parallel Corpus: A retrieval method between digital images and Chinese text based on the Chinese-English parallel corpus is proposed. A deep learning model for image-text information retrieval is then designed, together with a comprehensive similarity evaluation function that combines a comparative similarity measure with an angular similarity measure. Similarity evaluation methods for the semantic features of digital images, English text and Chinese text are discussed, and these features are mapped into the same semantic feature space. The deep learning model for Chinese-English text information retrieval is optimized with mini-batch gradient descent and linear learning-rate decay. The experimental results of bidirectional image-Chinese text retrieval are compared and analyzed.
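
As a rough illustration of the quadtree step in part (2), the sketch below splits an image into adaptive blocks, enlarges each leaf block by a small margin so that neighbouring blocks overlap, and returns the blocks in depth-first order. The abstract does not state the splitting criterion, the overlap width, or the child visiting order, so the intensity-variance threshold, the `overlap` margin, the top-left to bottom-right child order, and all function and parameter names here are assumptions made for illustration, not the thesis's actual algorithm.

```python
from statistics import pvariance


def region_variance(img, x, y, w, h):
    """Population variance of the pixel intensities inside a region."""
    pixels = [img[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    return pvariance(pixels)


def quadtree_blocks(img, x, y, w, h, var_threshold=100.0, min_size=2, overlap=1):
    """Return overlapping leaf blocks (x0, y0, x1, y1) in depth-first order.

    Assumed rule: a region is split into four children while its intensity
    variance exceeds var_threshold and it is larger than min_size.
    """
    height, width = len(img), len(img[0])
    homogeneous = region_variance(img, x, y, w, h) <= var_threshold
    if homogeneous or w <= min_size or h <= min_size:
        # Leaf: grow the block by `overlap` pixels, clipped to the image bounds.
        return [(max(x - overlap, 0), max(y - overlap, 0),
                 min(x + w + overlap, width), min(y + h + overlap, height))]
    hw, hh = w // 2, h // 2
    # Assumed child order: top-left, top-right, bottom-left, bottom-right.
    children = [(x, y, hw, hh), (x + hw, y, w - hw, hh),
                (x, y + hh, hw, h - hh), (x + hw, y + hh, w - hw, h - hh)]
    blocks = []
    for cx, cy, cw, ch in children:
        # Depth-first: finish one child's whole subtree before the next child.
        blocks.extend(quadtree_blocks(img, cx, cy, cw, ch,
                                      var_threshold, min_size, overlap))
    return blocks


# Toy 8x8 image: a checkerboard in the top-left 4x4 corner, flat elsewhere.
img = [[10] * 8 for _ in range(8)]
for r in range(4):
    for c in range(4):
        img[r][c] = 255 if (r + c) % 2 else 0

blocks = quadtree_blocks(img, 0, 0, 8, 8)
for block in blocks:
    print(block)
```

On this toy image only the textured corner is subdivided further, so detailed regions yield more, smaller blocks, while flat regions stay as single large blocks. The resulting ordered block list is the kind of input that the local feature extraction and self-attention aggregation described above would consume.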
Keywords/Search Tags: Chinese-English parallel corpus, semantic feature extraction, cross-modal, cross-language, information retrieval