Image Caption Research Using Recurrent Neural Network

Posted on: 2018-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q J Liao
GTID: 2348330533966705
Subject: Communication and Information System
Abstract:
With the rapid development of computer technology, a growing volume of data on the Internet contains both images and text: for example, news articles with their illustrations, video frames with subtitles, and messages uploaded by users. To make better use of such data, an algorithm is needed that understands the relationship between image content and its textual description. Image captioning has therefore become a key task at the intersection of computer vision and natural language generation. Given an image as input, an image captioning system analyzes the visual information and outputs a correct sentence describing the image content.

Image captioning is a difficult problem. Earlier algorithms that combined traditional image feature extraction with a language model did not produce satisfactory results. With the development of deep learning, methods based on recurrent neural networks and deep convolutional networks have achieved a breakthrough, but many aspects still need improvement. Building on a study of image captioning models that combine a recurrent neural network with a deep convolutional network, this paper makes two main contributions:

1. Based on the recurrent neural network and the deep convolutional network, an improved image captioning model, called the WICN model, is proposed. To address the shortcomings of existing methods, the WICN model introduces a method for detecting the words that represent the main concepts of an image, and uses a retrieval algorithm to improve detection accuracy. By combining this detection method with image features, the WICN model significantly improves captioning performance.

2. Much of the existing image captioning research targets English; this paper also studies image captioning in Chinese. Taking into account the different characteristics of Chinese and English, this paper presents an improved image captioning model based on Chinese unified coding, which avoids the errors introduced by a word segmentation algorithm. To solve the problem of word disorder that arises with unified coding, this paper further proposes an improved post word segmentation algorithm, which uses an n-gram model to constrain the output of the recurrent neural network.

The WICN model is validated and compared with other methods on the Flickr8k, Flickr30k and Pascal VOC 2008 datasets, and the results show a significant improvement on the image captioning problem. On the Chinese Flickr8k dataset, this paper compares different approaches to Chinese image captioning and verifies the proposed post word segmentation algorithm; the results show that the method significantly improves Chinese image captioning.
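The abstract does not say how the n-gram model limits the recurrent network's output, so the following is only a minimal sketch of one possible reading, not the thesis's algorithm. It assumes that the decoder emits one Chinese character per step and that a bigram table built from training captions filters out any character never seen after the previous one. The names `build_bigram_table`, `constrained_greedy_decode` and `toy_scores` are hypothetical, and `toy_scores` is a stand-in for a trained network.

```python
from collections import defaultdict

START, END = "<s>", "</s>"


def build_bigram_table(sentences):
    """Map each character to the set of characters seen directly after it."""
    allowed = defaultdict(set)
    for sentence in sentences:
        chars = [START] + list(sentence) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            allowed[prev].add(nxt)
    return allowed


def constrained_greedy_decode(step_scores, allowed, max_len=20):
    """Greedy decoding that only accepts characters forming a seen bigram.

    step_scores(prefix) returns a dict {character: score}; it stands in
    for the output layer of a recurrent network (an assumption here).
    """
    prefix = [START]
    while len(prefix) <= max_len:
        scores = step_scores(prefix)
        legal = allowed.get(prefix[-1], set())
        # Keep only candidates that form a bigram observed in training data.
        candidates = {c: s for c, s in scores.items() if c in legal}
        if not candidates:
            break
        best = max(candidates, key=candidates.get)
        if best == END:
            break
        prefix.append(best)
    return "".join(prefix[1:])


def toy_scores(prefix):
    """Toy scorer that ignores the prefix and always prefers the same character."""
    return {"狗": 0.9, "小": 0.8, "在": 0.7, "跑": 0.6, END: 0.5}


table = build_bigram_table(["小狗在跑", "小猫在睡"])
caption = constrained_greedy_decode(toy_scores, table)
print(caption)  # prints 小狗在跑
```

Without the bigram filter, the toy scorer would pick "狗" at every step; with it, each step is restricted to characters that followed the previous one in the training captions. A real system would more likely rescale scores with n-gram probabilities than apply a hard filter, which is a design choice this sketch leaves open.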
Keywords: Image Caption, Chinese Image Caption, Recurrent Neural Network, Deep Convolution Neural Network