Research On Key Technologies For Chinese Image Captioning

Posted on: 2018-04-02  Degree: Master  Type: Thesis
Country: China  Candidate: X F Zeng  Full Text: PDF
GTID: 2428330623950957  Subject: Computer Science and Technology
Abstract/Summary:
Natural language processing and computer vision are important domains of artificial intelligence. Image captioning is one of the key tasks at the intersection of the two fields; it is widely studied and has useful practical applications. Research on English image captioning is well developed and supported by abundant datasets. Research on Chinese image captioning, by contrast, is scarce, the corresponding datasets are rare, and building such datasets is time-consuming and laborious. This thesis uses the abundant English data together with transfer learning to address Chinese image captioning.

The first line of research is a Chinese image captioning model based on the fusion of image and text features (the FF-CIC model). The rich English data contains a great deal of information. A relationship between English and Chinese can be established through English-Chinese translation, and from it an image-English-Chinese relationship. This thesis proposes a feature-fusion model for Chinese image captioning that makes effective use of this image-English-Chinese relationship. The model first extracts features from the image and from the corresponding English description, then fuses the two features, and finally generates the Chinese description from the fused feature. Two fusion methods are studied: concatenation and weighting. Compared with a Chinese image captioning model that does not use the English information, the concatenation-fusion model improves the BLEU-1, BLEU-2, BLEU-3 and BLEU-4 scores by 11.4%, 8.5%, 5.4% and 1.5% respectively, while the weighting-fusion model improves them by 7.9%, 5.1%, 2.6% and 0.3%. In addition, image features and Chinese description features are fused for English image captioning, and image features and English description features are fused for Japanese image captioning, which demonstrates the extensibility of the model and explores the relationship between different languages.

The second line of research is a transfer-learning model for Chinese image captioning (the TL-CIC model). Native speakers of different languages pay attention to similar parts of the same image, so the descriptions they produce are similar; that is, speakers of different languages understand images in similar ways. Knowledge learned on English can therefore be transferred to the Chinese image captioning task. Moreover, aligned image-English-Chinese data is not easy to obtain, and the amount of image-English data differs from the amount of image-Chinese data, so it is sometimes better to use the image-English relationship and the image-Chinese relationship separately. The TL-CIC model is trained in two stages. The first stage trains the image processing module on image-English data and saves it. The second stage uses image-Chinese data to train the Chinese generation module and to retrain the image processing module. Compared with a Chinese image captioning model that does not use the English information, the model improves the BLEU-1, BLEU-2, BLEU-3 and BLEU-4 scores by 4.6%, 2.9%, 1.3% and 0.3% respectively. In addition, transfer from Chinese to English image captioning and transfer from Japanese to Chinese image captioning are used to demonstrate the extensibility of the model.
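The two fusion methods named in the abstract, concatenation and weighting, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: features are plain lists of floats, weighting is taken to be a convex combination controlled by a hypothetical parameter alpha, and the function names fuse_concat and fuse_weighted are invented here rather than taken from the thesis.

```python
def fuse_concat(image_feat, text_feat):
    """Concatenation fusion: append the text feature to the image feature.

    The fused vector has length len(image_feat) + len(text_feat).
    """
    return list(image_feat) + list(text_feat)


def fuse_weighted(image_feat, text_feat, alpha=0.5):
    """Weighting fusion, assumed here to be a convex combination.

    alpha is a hypothetical mixing weight: 1.0 keeps only the image
    feature, 0.0 keeps only the text feature. Both features must have
    the same length for the element-wise sum to be defined.
    """
    if len(image_feat) != len(text_feat):
        raise ValueError("weighted fusion needs features of equal length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * i + (1.0 - alpha) * t
            for i, t in zip(image_feat, text_feat)]


if __name__ == "__main__":
    # Toy feature vectors standing in for encoder outputs.
    image_feat = [0.2, 0.8, 0.5]
    text_feat = [0.6, 0.4, 0.1]
    print(fuse_concat(image_feat, text_feat))
    print(fuse_weighted(image_feat, text_feat, alpha=0.5))
```

Under these assumptions, concatenation keeps both features intact at the cost of a longer input to the caption generator, whereas weighting keeps the dimension fixed but requires the two features to share a common size.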
Keywords/Search Tags: Chinese, Image Captioning, Image and Text Feature Fusion, Transfer Learning