With the emergence of high-performance computing hardware, deep learning technology has developed rapidly. The increasingly interconnected world has produced a torrent of data. Faced with an exponentially growing volume of image data, enabling computers to understand image content efficiently in place of humans has become a research hotspot. Image captioning detects and identifies image content: it must not only perceive the type of scene but also identify object attributes and their relations, and its ultimate goal is to generate natural language that reasonably describes the image content. As a key task at the intersection of natural language processing and computer vision, image captioning has both research value and practical value, and deep learning has become the main approach to solving it. Although research on image captioning has achieved repeated breakthroughs, the generated sentences still miss image details and deviate from human understanding. This thesis studies image captioning models based on deep learning; the specific work is as follows:

1. An image captioning model based on adaptive recalibration of attention features is proposed. On top of an attention mechanism that fuses image features, a channel activation layer is constructed to capture channel-wise dependencies and adaptively recalibrate the attention features, which boosts the representational power of the features and ultimately improves the quality of the sentences generated by the long short-term memory network (see the recalibration sketch below). Comparison experiments were conducted on three standard data sets: MS COCO, Flickr8k and Flickr30k. The results show that the proposed model achieves BLEU_1, BLEU_2, BLEU_3, BLEU_4, METEOR and CIDEr scores of 69.4%, 52.3%, 38.6%, 28.5%, 23.3% and 83.6% on MS COCO, outperforming traditional neural-network image captioning models and generating more accurate captions.

2. Building on the existing work, a Chinese-oriented image captioning task is realized and the image captioning model is further optimized. To address the shortcoming that a long short-term memory network only considers preceding context, a bidirectional long short-term memory network is proposed as the language generation network of the captioning model; it considers preceding and following context simultaneously and improves the generated caption sentences (see the decoder sketch below). Meanwhile, at the stage of building Chinese vocabularies of different sizes, a word-segmentation speed-up method is proposed: Cython is used to reimplement the three core algorithms of the word segmentation tool to accelerate segmentation. Comparison experiments were conducted on the ICC data set. The results show that the word-segmentation acceleration improves segmentation speed by 63.9%, that the captioning model with a vocabulary size of 8000 performs best, and that using a bidirectional long short-term memory network improves the performance of the image captioning model.
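
The channel activation layer in contribution 1 is only described at a high level above. The following is a minimal sketch, assuming a squeeze-and-excitation-style gating over the channels of the attention-weighted CNN feature map; the class name `ChannelRecalibration`, the reduction ratio, and the tensor shapes are illustrative assumptions, not the thesis's exact design.

```python
import torch
import torch.nn as nn

class ChannelRecalibration(nn.Module):
    """Hypothetical channel activation layer: squeeze-and-excitation-style
    gating that rescales each channel of an attended feature map."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),   # per-channel weights in (0, 1)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, channels, H, W) attention-weighted CNN features
        squeezed = feats.mean(dim=(2, 3))       # global average pool -> (batch, channels)
        weights = self.gate(squeezed)           # capture channel-wise dependencies
        return feats * weights.unsqueeze(-1).unsqueeze(-1)  # recalibrated features
```

In an attention-based captioning pipeline of this kind, the recalibrated features would then be fed to the LSTM decoder at each time step in place of the raw attended features.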
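Contribution 2 replaces the unidirectional decoder with a bidirectional LSTM so that each position can use both preceding and following context. Below is a minimal training-time sketch with teacher forcing over the reference caption; the module names, the way image features are injected, and the dimensions are assumptions for illustration, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn

class BiLSTMCaptioner(nn.Module):
    """Hypothetical bidirectional LSTM language network for image captioning.
    At training time it reads the embedded reference caption in both
    directions and predicts a word distribution at every position."""

    def __init__(self, vocab_size: int, embed_dim: int = 256,
                 hidden_dim: int = 512, feat_dim: int = 2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.img_proj = nn.Linear(feat_dim, embed_dim)      # inject image features
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)    # forward + backward states

    def forward(self, img_feats: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # img_feats: (batch, feat_dim), captions: (batch, seq_len) token ids
        img = self.img_proj(img_feats).unsqueeze(1)          # (batch, 1, embed_dim)
        tokens = self.embed(captions)                        # (batch, seq_len, embed_dim)
        inputs = torch.cat([img, tokens], dim=1)             # prepend image as first "token"
        states, _ = self.bilstm(inputs)                      # (batch, seq_len+1, 2*hidden)
        return self.out(states[:, 1:, :])                    # logits per caption position
```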
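The BLEU_1 through BLEU_4 figures quoted above come from the standard corpus-level captioning evaluation toolkit; as a rough illustration of how such n-gram precision scores are computed for a single caption, NLTK's sentence-level BLEU can be used (METEOR and CIDEr require dedicated evaluation code and are not shown). The tokenized example captions are invented for illustration.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [["a", "dog", "runs", "on", "the", "grass"]]   # reference caption(s), tokenized
hypothesis = ["a", "dog", "is", "running", "on", "grass"]   # generated caption, tokenized

smooth = SmoothingFunction().method1
for n in range(1, 5):
    # BLEU_n uses uniform weights over the 1..n-gram precisions
    weights = tuple(1.0 / n for _ in range(n))
    score = sentence_bleu(references, hypothesis, weights=weights,
                          smoothing_function=smooth)
    print(f"BLEU_{n}: {score:.3f}")
```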