Research And Application Of Image Caption Technology

Posted on: 2021-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Lin
Full Text: PDF
GTID: 2392330605451182
Subject: Control Science and Engineering
Abstract/Summary:
Image captioning is a technology that connects computer vision and natural language processing. Its practical value grows as the technology improves, and it remains both a research hotspot and a difficult problem in artificial intelligence. Ships are the main participants in marine activities, and obtaining ship information in a timely manner is a basic technical component of China's national marine strategy. Intelligent processing of ship remote sensing images follows two main directions: (1) ship recognition based on computer vision, such as automatic ship recognition based on deep learning, which yields an accurate ship type; and (2) ship semantic information generation based on image captioning, which transforms an unstructured remote sensing image into a structured sentence describing the motion state, surrounding environment, scene, and other semantic information. This thesis focuses on the generation of ship semantic information based on image captioning. A new image captioning model, named Image Caption Based on Visual Features and Mixed Attention, is proposed and applied to ship supervision. The main work is as follows:

(1) To address the shortcomings of Image Caption Based on Spatial Attention, such as insufficient extraction of image features and a lack of fluency in the generated sentences, a model named Image Caption Based on Visual Features and Mixed Attention is proposed. First, a visual feature extraction network is used to improve feature extraction. Second, an LSTM with a visual selection mechanism is designed to improve the fluency of the generated sentences. Third, multi-spatial feature matching is used to recalibrate the generation of spatial attention. Fourth, visual information and language information are integrated through mixed attention. Finally, the objective is optimized through reshaping, which improves the overall quality of the generated sentences.

Two experiments were carried out to verify the model. In Experiment 1, the results of Image Caption Based on Visual Features and Mixed Attention on the COCO Caption test dataset were submitted to the official server for evaluation. The CIDEr-D score reached 1.225, the BLEU-4 score reached 0.336, and the ROUGE score reached 0.579, which was better than other image captioning models based on the attention mechanism. In Experiment 2, results on a self-built ship image captioning dataset showed that the model could meet the requirements of semantic information extraction and improve the quality of the generated sentences.

(2) To address the problem of marine ship supervision and to meet the requirement that an unattended video surveillance system generate ship information automatically, a method for automatic generation of ship information using Image Caption Based on Visual Features and Mixed Attention is proposed. First, redundancy in the video stream is removed by key frame extraction. Second, the ship in each key frame is detected and recognized by the ship recognition network. Third, Image Caption Based on Visual Features and Mixed Attention generates text describing the motion state of the ship, the surrounding scene, and other information in the key frame. Finally, the ship recognition result, time, location, ship motion state, and surrounding scene are summarized, which realizes the automatic generation of ship information.

To verify this method, key frame extraction, the ship recognition network, and Image Caption Based on Visual Features and Mixed Attention were deployed on the Jetson AGX Xavier embedded platform, and the modules were integrated with script files. A prototype video surveillance platform based on ship image captioning was designed and implemented using Nginx, Python, and Flask. The results showed that the automatic ship information generation method in this thesis could meet the requirements for real-time performance, accuracy, and reliability.
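
The four steps of the ship information generation method in (2) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not say how key frames are chosen, so a simple frame-difference threshold is assumed here, frames are assumed to be flattened grayscale pixel lists, and the names `extract_key_frames`, `ShipRecord`, `generate_ship_records`, and the `recognize`, `caption`, and `time_of` callables are hypothetical stand-ins for the ship recognition network, the captioning model, and the timestamp source.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Assumption: a frame is a flattened sequence of grayscale pixel values.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def extract_key_frames(frames: List[Frame], threshold: float = 10.0) -> List[int]:
    """Step 1 (assumed criterion): keep a frame only if it differs enough
    from the most recently kept frame, which drops redundant frames."""
    if not frames:
        return []
    kept = [0]  # the first frame is always kept
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


@dataclass
class ShipRecord:
    """Step 4: the summarized information for one key frame."""
    frame_index: int
    ship_type: str
    caption: str
    time: str
    location: str


def generate_ship_records(
    frames: List[Frame],
    recognize: Callable[[Frame], str],  # hypothetical ship recognition network
    caption: Callable[[Frame], str],    # hypothetical image captioning model
    time_of: Callable[[int], str],      # hypothetical frame-index-to-time lookup
    location: str,
    threshold: float = 10.0,
) -> List[ShipRecord]:
    """Run steps 1 to 4: key frames, recognition, captioning, summary."""
    records = []
    for i in extract_key_frames(frames, threshold):
        records.append(
            ShipRecord(
                frame_index=i,
                ship_type=recognize(frames[i]),  # step 2
                caption=caption(frames[i]),      # step 3
                time=time_of(i),
                location=location,
            )
        )
    return records
```

Passing the recognition and captioning models in as callables keeps the sketch independent of any particular network, which loosely mirrors the abstract's description of separate modules integrated by script files.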
Keywords/Search Tags: image caption, ship supervision, convolutional neural network, long short-term memory