Research And Application Of Image Captioning Based On Deep Neural Network

Posted on:2022-03-14

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Wang

Full Text:PDF

GTID:2518306350494734

Subject:Control Science and Engineering

Abstract/Summary:

Image captioning aims to imitate the way humans think: a model analyzes the features of an input image and generates a text sequence describing its content. Many image captioning algorithms have been proposed and achieve good prediction results, but problems remain: predictions do not always match the actual scene, and model structures are often too complex for practical application. This thesis therefore proposes two image captioning models. The first is an image captioning model based on BDR-GRU (Img-BDR-GRU); the second is an encoder-decoder network model based on visual guidance (VG-ED).

The first model, Img-BDR-GRU, draws on the idea of deep residual learning to design a new residual GRU model. To enrich the information carried by the text, a bidirectional recurrent neural network is used, and the BDR-GRU network is built by combining the bidirectional structure with the deep residual design. Combining BDR-GRU with a convolutional neural network yields the final image captioning model, Img-BDR-GRU.

The second model, VG-ED, draws on the attention mechanism to design a new F-LSTM model that integrates image information and text information. Although earlier attention mechanisms give good prediction quality, they carry a high computational cost, which hinders practical application of the network. This thesis therefore improves the attention mechanism by using the global image feature vector output by the convolutional part, and uses an adaptive attention mechanism to decide at each time step, according to the current context, how image information is used. Combining the newly designed model with a convolutional neural network yields the final image captioning model, VG-ED.

Experimental results show that the prediction quality of both image captioning models is improved, and that the CIDEr score of the second model is higher, indicating that its output is closer to the real image descriptions.

To build an intelligent real-time monitoring system that meets practical needs, this thesis proposes a solution based on the image captioning algorithm. First, GPU resource virtualization is implemented so that multiple image captioning algorithms can use GPU resources at the same time. Next, a real-time video streaming server transmits live video frame data to the image captioning algorithm server, which analyzes the data and generates a passage of text describing the scene. Finally, a voice broadcast system converts the text into audio and broadcasts it.
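The abstract does not give the equations of the BDR-GRU, so the following is only a minimal sketch of one plausible reading of "bidirectional plus deep residual": each direction runs a standard GRU cell, the step input is added back to the hidden state as a skip connection, and the two directions are concatenated. The names `GRUCell`, `run_residual` and `bidirectional_residual_gru`, the equal input and hidden sizes, and the random weight initialization are all assumptions made for illustration, not details taken from the thesis.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Plain GRU cell without biases (hypothetical, for illustration only)."""

    def __init__(self, size, rng):
        # Input and hidden sizes are assumed equal so the skip connection can add them.
        self.size = size
        self.W = {g: rand_matrix(size, size, rng) for g in "zrh"}  # input weights
        self.U = {g: rand_matrix(size, size, rng) for g in "zrh"}  # recurrent weights

    def step(self, x, h):
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # Candidate state computed from the input and the reset-gated previous state.
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in zip(matvec(self.W["h"], x), matvec(self.U["h"], rh))]
        # Blend previous state and candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def run_residual(cell, seq):
    # Run one direction; each output is the hidden state plus the step input (assumed residual form).
    h = [0.0] * cell.size
    outputs = []
    for x in seq:
        h = cell.step(x, h)
        outputs.append([hi + xi for hi, xi in zip(h, x)])
    return outputs


def bidirectional_residual_gru(seq, fwd, bwd):
    # Forward pass, backward pass over the reversed sequence, then per-step concatenation.
    f = run_residual(fwd, seq)
    b = run_residual(bwd, seq[::-1])[::-1]
    return [fo + bo for fo, bo in zip(f, b)]
```

In this sketch a sequence of T vectors of size d produces T output vectors of size 2d, one half per direction. A real captioning model would feed word embeddings together with CNN image features and would learn the weights by training.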

Keywords/Search Tags:

Image Captioning, Deep Neural Network, Real-time Video Broadcast System, Residual Network, Attention Mechanism

Related items

1	Research On Image Captioning Generation Based On Faster R-CNN And Visual Attention
2	Image Captioning Based On Deep Recurrent Convolution Network And Spatio-temporal Information Fusion
3	Research On Image Captioning Algorithm Based On Deep Learning
4	Researches On Short Video Captioning Based On Deep Learning
5	Research Of The Image Captioning Based On Unsupervised Method
6	Research On Video Captioning Of Abnormal Events Based On Attention Mechanism
7	Image Chinese Captioning Model Based On Deep Learning
8	Research On Image Captioning Method Based On Deep Neural Networks And Adaptive Attention Mechanism
9	Research On Semantic Guiding Video Captioning Methods With Attention Mechanism And Memory Network
10	Research And Optimization Of Image Denoising Algorithm Based On Deep Neural Network