
Research On Image Semantics Understanding In Chinese Based On Deep Learning

Posted on: 2019-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Zhao
GTID: 2428330542482335
Subject: Computer technology
Abstract/Summary:
The task of image semantic understanding is to interpret a given image and describe it in natural language. It is a comprehensive task that combines computer vision and natural language processing. Compared with earlier tasks, it requires not only identifying the key objects in an image but also understanding the relationships among them, sometimes including abstract concepts, and finally expressing this semantic information in fluent language. With the rise of machine translation and big data, there has been a wave of research on image understanding in China and abroad. However, the accuracy and completeness of Chinese image descriptions remain low, and, given the particular characteristics of Chinese words, Chinese semantic understanding has not made good progress. This thesis uses an encoding-decoding network framework, and our system better solves the problem on the dataset provided by the 2017 "AI Challenger · Global AI Challenge". The main tasks are:

(1) Analysis and selection of image feature extraction networks. Deep features have advantages over traditional hand-crafted features. In this thesis, deep features are used as the image encoding, and VGG and ResNet features are compared and analyzed experimentally.

(2) Preprocessing and encoding of the Chinese annotation data. Because of the particular characteristics of Chinese, the data must be cleaned, segmented into words, and embedded as word vectors before being fed into the neural network. First, the data is cleaned using rules derived from experiments and analysis. The cleaned data is then segmented with a bidirectional LSTM and CRF model, which reaches a word segmentation accuracy of 96.8%. Finally, word embeddings are produced with the Skip-gram model.

(3) Definition and implementation of the Chinese semantic generation model. The deep image features and the word vectors obtained above are fed into an automatic encoding-decoding network for training, which finally generates the Chinese semantic description. The network mainly adopts the LSTM structure, and we add a doubly soft-attention mechanism, which makes the generated Chinese descriptions of images richer and more specific. The main reason is that the mechanism can focus on the image region corresponding to the text produced at each LSTM time step, which yields a more specific and accurate description.

The algorithms are implemented with the TensorFlow framework and Python, and a large number of experiments verify their validity and practicability. Due to the limitations of the experimental environment, only part of the challenge dataset was used for training and testing: 100,000 training samples, 30,000 validation samples, and 30,000 test samples. Performance is evaluated with three metrics: BLEU, CIDEr, and ROUGE. The results show that the system achieves high performance on the task of Chinese image semantic understanding and that the descriptions have better accuracy and completeness.
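The attention step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it is plain Python, not TensorFlow, and the dot-product scoring, the function names, and the assumption that region features and the decoder hidden state share one dimensionality are all illustrative choices. The penalty function assumes that "doubly" refers to a regularizer that encourages each image region to receive a total attention of about 1 across all time steps; the abstract does not confirm this.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(region_features, hidden_state):
    """One decoder time step of soft attention (hypothetical helper).

    region_features: list of feature vectors, one per image region.
    hidden_state: decoder (e.g. LSTM) hidden state, assumed here to have
        the same length as each region vector.
    Returns (weights, context): weights sum to 1 over the regions, and
    context is the weighted average of the region vectors.
    """
    # Assumption: dot-product scoring, chosen only for brevity.
    scores = [
        sum(f * h for f, h in zip(feat, hidden_state))
        for feat in region_features
    ]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


def doubly_stochastic_penalty(weights_per_step):
    """Assumed regularizer: sum over regions of (1 - total attention)^2.

    weights_per_step: one attention-weight list per decoder time step.
    """
    num_regions = len(weights_per_step[0])
    return sum(
        (1.0 - sum(step[i] for step in weights_per_step)) ** 2
        for i in range(num_regions)
    )


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    hidden = [2.0, 0.0]
    weights, context = soft_attention(regions, hidden)
    print("attention weights:", weights)
    print("context vector:", context)
```

In a full captioning model the context vector would be fed to the LSTM together with the previous word embedding at each time step, and the penalty would be added to the training loss with a tunable coefficient.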
Keywords/Search Tags:Deep Learning, Chinese semantic understanding, automatic encoding-decoding network, data preprocessing, soft attention mechanism