Research On Image Description Method Based On Region Correlation And Attention

Posted on: 2020-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Li
Full Text: PDF
GTID: 2428330578450928
Subject: Computer application technology
Abstract/Summary:
Over the past decade or so, the emergence of large-scale training datasets and high-performance computing hardware has driven the rapid development of deep learning, which is now widely used in many fields. With the growth of the Internet and the popularity of cameras, the number of images online has grown geometrically, and it is no longer possible to identify image content by human effort alone. How to make a computer describe an image automatically has therefore become a research hotspot in image understanding. The task spans two areas of artificial intelligence: computer vision and natural language processing. The basic principle is to extract information from the image to be described, identify the objects in it, perceive the scene content and the relationships between the objects, and finally express all of this in coherent language.

This thesis surveys recent research on image description methods in China and abroad and finds that traditional methods ignore the interdependence between objects in the image when extracting image features. To address this, an image feature extraction optimization method based on region correlation is proposed. VGG and RPN are combined to perform image feature extraction and candidate region selection. For each candidate region, the spatial distance between it and every other region is calculated and used as a measure of that region's overall correlation with the other regions. Based on this metric, the part of the feature map corresponding to each candidate region is weighted and optimized, and the weighted feature map is then fed to the language module to take part in text generation.

The thesis then studies the application of the Attention mechanism to image description. Traditional Attention focuses on spatial positions in the image but ignores semantic information. This thesis proposes an improved semantic Attention mechanism that requires no additional semantic information extraction, because the Attention mechanism acts directly on the image feature map. In CNN-based feature extraction, the feature map output by a convolutional layer is a stack of multiple channels, which contain abstract semantic information at multiple levels. Attention is therefore applied over the channels of the image feature map, forming a semantic Attention. The language generation module is based on the NIC model and incorporates Attention, so the introduced Attention module can attend to spatial and semantic information in the image at the same time. The weighted and optimized image feature map serves as the input to the language module, which generates the image description text.

Finally, experiments were carried out and compared with previous studies. The model is trained for image description on the MS COCO dataset. The experimental results show that the proposed image description method based on region correlation and Attention can significantly improve the quality of the generated descriptions.
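
As a rough illustration of the two weighting ideas described in the abstract, the following is a minimal sketch in plain Python, not the thesis's implementation. It rests on several assumptions that the abstract does not state: spatial distance is taken as the Euclidean distance between box centres, a smaller mean distance is treated as a higher correlation, and channel attention is reduced to a softmax over per-channel average activations with no learned parameters. The function names `region_correlation_weights` and `channel_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def region_correlation_weights(boxes):
    """Return one weight per candidate region, summing to 1.

    boxes: list of (x1, y1, x2, y2) tuples.
    Assumption: a region whose centre is, on average, closer to the
    other regions is considered more correlated with them.
    """
    centers = [((x1 + x2) / 2.0, (y1 + y2) / 2.0) for x1, y1, x2, y2 in boxes]
    n = len(centers)
    if n == 0:
        return []
    if n == 1:
        return [1.0]
    scores = []
    for i, (cx, cy) in enumerate(centers):
        total = sum(
            math.hypot(cx - ox, cy - oy)
            for j, (ox, oy) in enumerate(centers)
            if j != i
        )
        mean_dist = total / (n - 1)
        scores.append(1.0 / (1.0 + mean_dist))  # closer -> larger score
    norm = sum(scores)
    return [s / norm for s in scores]


def channel_attention(feature_map):
    """Reweight the channels of a feature map.

    feature_map: list of channels, each a flat list of activations.
    Assumption: the attention score of a channel is its average
    activation; a real model would learn this scoring function.
    Returns (reweighted feature map, per-channel attention weights).
    """
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    alphas = softmax(pooled)
    weighted = [[a * v for v in channel] for a, channel in zip(alphas, feature_map)]
    return weighted, alphas


# Two neighbouring boxes and one far-away box.
boxes = [(0, 0, 10, 10), (10, 0, 20, 10), (100, 100, 110, 110)]
weights = region_correlation_weights(boxes)

# A toy feature map with two channels of two activations each.
weighted_map, alphas = channel_attention([[1.0, 1.0], [3.0, 3.0]])
```

In this toy input the isolated third box receives the smallest weight, and the channel with the larger average activation receives the larger attention weight. In the method described above, such weights would scale the feature-map regions and channels before they are passed to the language module.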
Keywords/Search Tags:regional correlation, image feature extraction, neural network, semantic attention, Long Short-Term Memory