
Research On Deep Learning Based Image Captioning

Posted on: 2020-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: T Huang
Full Text: PDF
GTID: 2428330575956506
Subject: Information and Communication Engineering
Abstract/Summary:
Image captioning is an active topic in current artificial intelligence research. It links computer vision with natural language generation and automatically generates a caption that corresponds to the content of an image. Research on image captioning brings together two of the most common data types, images and text, so that a computer can perceive the visual world and express what it sees in human language. Images are usually associated with text through an Encoder-Decoder structure: the input image is recognized and encoded into a feature vector, and the caption is generated from these image features. Building on the Encoder-Decoder structure, this thesis focuses on the global features of an image and explores the different levels of feature representation extracted from deep convolutional networks, the caption generation methods, and the corresponding feature fusion methods in captioning models. The main work of this thesis includes the following aspects:
(1) From the perspective of image encoding, it analyzes and compares how image features taken from different depths of a deep convolutional network perform in captioning, in terms of expressiveness and ease of decoding.
(2) From the perspective of caption generation, it introduces a convolutional structure and an attention mechanism, respectively, into caption generation, explores feature fusion methods suited to each generation structure, and finally forms two new captioning networks.
(3) It analyzes the extraction, encoding, and decoding of global image features, explores the effects of different caption generation networks and feature fusion methods, and offers a basic reference for subsequent research on Encoder-Decoder based captioning.
In this thesis, different methods are applied to construct the captioning network. The experimental results indicate that the caption generation networks based on the convolutional structure and the attention mechanism, together with their corresponding fusion methods, are applicable to captioning and can produce the desired outcomes.
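The Encoder-Decoder data flow described in the abstract can be illustrated with a minimal, untrained sketch. It is not the thesis's model: the mean-pooling "encoder", the additive feature fusion, the random weights, the toy vocabulary, and all names (`encode_image`, `ToyDecoder`, `generate_caption`) are assumptions made purely for illustration. It shows only how a global image feature vector can be fused into each step of a greedy caption generation loop.

```python
import math
import random

# Hypothetical toy vocabulary; a real captioning model learns one from data.
VOCAB = ["<start>", "<end>", "a", "dog", "cat", "runs", "sits"]


def encode_image(image):
    """Toy encoder: average every channel over all pixels.

    `image` is a nested list [row][column][channel]. The result stands in
    for the global feature vector a deep convolutional network would give.
    """
    n_pixels = sum(len(row) for row in image)
    n_channels = len(image[0][0])
    return [
        sum(px[c] for row in image for px in row) / n_pixels
        for c in range(n_channels)
    ]


def make_matrix(rows, cols, rng):
    """Random (untrained) weight matrix as a list of rows."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyDecoder:
    """Recurrent decoder with random weights (assumption: no training)."""

    def __init__(self, feat_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.embed = make_matrix(len(VOCAB), hidden_dim, rng)  # one row per word
        self.w_feat = make_matrix(hidden_dim, feat_dim, rng)
        self.w_hid = make_matrix(hidden_dim, hidden_dim, rng)
        self.w_out = make_matrix(len(VOCAB), hidden_dim, rng)

    def step(self, feature, hidden, word_id):
        # Additive fusion: projected image feature + previous word
        # embedding + recurrent term, squashed by tanh.
        proj = matvec(self.w_feat, feature)
        rec = matvec(self.w_hid, hidden)
        hidden = [
            math.tanh(p + e + r)
            for p, e, r in zip(proj, self.embed[word_id], rec)
        ]
        return hidden, matvec(self.w_out, hidden)


def generate_caption(image, decoder, max_len=10):
    """Greedy decoding: pick the highest-scoring word at each step."""
    feature = encode_image(image)
    hidden = [0.0] * len(decoder.w_hid)
    word_id = VOCAB.index("<start>")
    caption = []
    for _ in range(max_len):
        hidden, scores = decoder.step(feature, hidden, word_id)
        # Skip index 0 so "<start>" is never emitted as a word.
        word_id = max(range(1, len(VOCAB)), key=scores.__getitem__)
        if VOCAB[word_id] == "<end>":
            break
        caption.append(VOCAB[word_id])
    return caption
```

Because the weights are random, the generated words carry no meaning; the sketch is useful only for tracing how the image feature enters every decoding step. Swapping the addition in `step` for concatenation or an attention-weighted sum would be one way to experiment with other fusion choices.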
Keywords/Search Tags: deep learning, image captioning, image features, text generation