
Research And System Implementation Of Image Caption Based On Deep Learning

Posted on: 2021-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2518306308469504
Subject: Computer technology
Abstract/Summary:
Images are a main carrier of information storage and query, and the total amount of image data stored on the network grows exponentially every year. How to find key images efficiently with search engines has therefore become a significant problem in image retrieval applications. In practice, a large share of the images on the network have no corresponding description label. For image retrieval, it is therefore important to use efficient algorithms to describe unlabeled images and to present their content accurately to users. With the rapid development of AI technology, deep learning combined with large-scale GPU computing has been applied in many areas, including but not limited to face recognition, machine translation and speech recognition. This thesis uses deep learning to address the problem of image labeling. A deep learning model can understand the content of an image and extract its semantic information, so it can model the relationship between image content and a semantic description and then generate text that users can understand. Based on neural networks and the attention mechanism, this thesis proposes a model that generates image descriptions and develops a Chinese image retrieval system. The main contributions are as follows:

1. An image caption model is proposed. The model contains three fundamental components: image encoding, feature extraction and image decoding. Specifically, the ResNeXt network is used as the encoder of the image data, and the feature extraction phase applies channel and spatial attention mechanisms to learn the channel and spatial weights of the image. The ON-LSTM model is used in the image decoding phase to capture the order and hierarchy of sentences. In addition, the model introduces an overcorrection method to address the overcorrection and error accumulation that arise when the data distribution is inconsistent between the training set and the test set.

2. Building on this model, a Chinese image retrieval system is implemented in Java, Python, HTML and other languages, consisting of a web-side application and a WeChat service built on the back end. Docker packaging, nginx and multiple databases are used to ensure system performance.
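The channel and spatial attention step described in the first contribution can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a feature map stored as nested lists of shape [channels][height][width], and it replaces the learned layers that a real attention module would use with parameter-free sigmoid gates over average-pooled values. The function names `channel_attention`, `spatial_attention` and `attend` are hypothetical.

```python
import math


def sigmoid(x):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features):
    """Scale each channel by a gate computed from its global average.

    Assumption: a real module would pass the pooled values through
    learned layers; here the average feeds the sigmoid directly.
    """
    weights = []
    for channel in features:
        values = [v for row in channel for v in row]
        weights.append(sigmoid(sum(values) / len(values)))
    scaled = [[[v * w for v in row] for row in channel]
              for channel, w in zip(features, weights)]
    return scaled, weights


def spatial_attention(features):
    """Scale each location by a gate computed from its cross-channel average.

    Assumption: a real module would apply a learned convolution to the
    pooled map; here the average feeds the sigmoid directly.
    """
    n_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    gate = [[sigmoid(sum(features[c][i][j] for c in range(n_channels)) / n_channels)
             for j in range(width)]
            for i in range(height)]
    scaled = [[[features[c][i][j] * gate[i][j] for j in range(width)]
               for i in range(height)]
              for c in range(n_channels)]
    return scaled, gate


def attend(features):
    """Apply channel attention first, then spatial attention."""
    after_channel, channel_weights = channel_attention(features)
    after_spatial, spatial_gate = spatial_attention(after_channel)
    return after_spatial, channel_weights, spatial_gate
```

Calling `attend` on an encoder feature map returns the reweighted features together with the per-channel weights and the per-location gate, which makes it easy to inspect what each stage emphasises. The channel-then-spatial order is a choice made for this sketch; the abstract does not state the order used in the thesis.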
Keywords/Search Tags:Image caption, Attention mechanism, Neural network, Chinese image retrieval