
The Research And Application Of Image Captioning Based On Deep Learning

Posted on: 2018-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: D X Zhu
Full Text: PDF
GTID: 2348330533969241
Subject: Computer Science and Technology
Abstract/Summary:
Artificial intelligence is a direction that people have been exploring for a long time, and enabling computers to learn human abilities is of vital significance. Thanks to improvements in parallel computing power and the explosive growth of data, many neural network algorithms have emerged. These networks usually have many layers, so they are called deep neural networks or deep learning algorithms. Deep learning algorithms are surprisingly effective on complex artificial intelligence tasks and have been applied in many fields. The main research content of this thesis is the image captioning algorithm and its application. The task is complex because it lies at the intersection of computer vision and natural language processing. In this thesis, we design and model the different parts of the image captioning task using deep learning algorithms, and apply the resulting algorithm to an OCR task.

For the image captioning task, this thesis proposes two algorithms, past-feeding and past-attention, each of which improves a different neural network structure. The first, past-feeding, uses the embeddings of previously predicted words as auxiliary information when predicting the current output word. The second, past-attention, builds a relationship between the attention vectors at multiple time steps so that the attention vector is generated more reasonably. The whole model is divided into two parts, language information and image information, which makes the model clearer. This thesis describes not only the general framework of the model but also the detailed derivation of the formulas, and finally visualizes the image captioning process. The visualization shows clearly how the algorithm extracts image features and how the attention moves as each word of the caption is predicted. The final experiments show that the two proposed algorithms achieve different degrees of improvement under different evaluation metrics.

For the verification code image captioning task, we propose the OCR-IC algorithm, which approaches the problem from the perspective of image captioning. We fine-tune the network structure according to the characteristics of verification code images. Compared with traditional algorithms, OCR-IC has several advantages: it requires no manual image segmentation, it supports variable-length verification codes, and it achieves high accuracy. The final experiments show that OCR-IC achieves good accuracy on both fixed-length and variable-length verification codes.
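The abstract does not give the details of OCR-IC, so the following is only a minimal sketch of the general idea that a caption-style decoder can read a verification code without character segmentation and without a fixed length: it emits one character per step and stops when it predicts an end token. The names `END`, `greedy_decode`, and `make_toy_step` are hypothetical, and the toy `step` function merely stands in for a trained network conditioned on image features.

```python
from typing import Callable, Dict, List, Sequence

END = "<end>"  # hypothetical end-of-sequence token, not taken from the thesis


def greedy_decode(step: Callable[[Sequence[str]], Dict[str, float]],
                  max_len: int = 10) -> str:
    """Emit one character per step until END is predicted or max_len is reached."""
    out: List[str] = []
    for _ in range(max_len):
        scores = step(out)                   # scores for the next token given the prefix
        token = max(scores, key=scores.get)  # greedy choice: highest-scoring token
        if token == END:
            break                            # the decoder decides the length itself
        out.append(token)
    return "".join(out)


def make_toy_step(target: str) -> Callable[[Sequence[str]], Dict[str, float]]:
    """Stand-in for a trained decoder: always scores the next character of `target` highest."""
    vocab = sorted(set(target)) + [END]

    def step(prefix: Sequence[str]) -> Dict[str, float]:
        nxt = target[len(prefix)] if len(prefix) < len(target) else END
        return {tok: (1.0 if tok == nxt else 0.0) for tok in vocab}

    return step


# The same decoding loop handles codes of different lengths.
short_code = greedy_decode(make_toy_step("7KQ2"))
long_code = greedy_decode(make_toy_step("AB93XZ"))
```

Because the stopping decision comes from the predicted end token, the loop itself needs no knowledge of how many characters the image contains; `max_len` is only a safety bound.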
Keywords/Search Tags:image caption, deep learning, OCR