Font Size: a A A

Research On Key Issues Of Image High Level Semantic Understanding

Posted on:2020-08-30Degree:MasterType:Thesis
Country:ChinaCandidate:Y W DuFull Text:PDF
GTID:2428330596976529Subject:Engineering
Abstract/Summary:PDF Full Text Request
Since the 21 st century,massive visual information relies on interpreting digital image acquisition.How to make the computer quickly and accurately understand the scenes,actions and natural language collation and expression in digital images is a problem that needs to be solved.Based on deep learning,we designed a network model that can understand image semantics by using Convolutional Neural Networks(CNN)fuse Recursive Autoencoder(RAE)or Long Short-Term Memory(LSTM)network.The CNN-RAE model in both models is reliable in image scene prediction applications.We introduce information divergence into the model to measure the degree of approximation of the feature distribution of the latent semantic distribution.We use evaluation criteria and comparative experiments to verify the feasibility and reliability of two neural network models which is composed of CNN and LSTM in the application of high-level semantic understanding of images.The main works of this thesis are listed as follows:(1)We designed a network model for the joint application of CNN and RAE.The experimental results verify that the model can have reliable performance in image scene prediction,but it also demonstrates the shortcomings of the model in the application of high-level semantic understanding.(2)We set the information divergence as the loss function for the distribution of images and text information.We designed a network model with two-layer LSTM network,and designed and completed a comparative experiment according to two evaluation criteria,called SPICE and SGA.Finally,the results of the comparative experiment were analyzed.(3)By improving the network model in(2),we constructed a new characterization network of hierarchical transformation models to approximate image feature distribution and latent semantic distribution.The most similar latent semantic distribution describes the semantics in the image.The model can automatically locate the image features,select important features,and establish a joint distribution of images to predict the high-level semantics of the image.(4)We train the model in(3)with a few fine-tuning techniques.We designed and completed the comparative experiment according to the evaluation standard BLEU,and objectively evaluated the model through the experimental results.
Keywords/Search Tags:Deep learning, Information divergence, Image high-level semantic understanding, Long Short-Term Memory Network
PDF Full Text Request
Related items