Research On Visual Question Answering Based On Multi-Channel CNN-LSTM

Posted on:2019-11-23

Degree:Master

Type:Thesis

Country:China

Candidate:H W Zhang

Full Text:PDF

GTID:2428330548474406

Subject:Computer application technology

Abstract/Summary:

PDF Full Text Request

Visual Question Answering is a multi-domain task of machine learning.It is considered as one of the most shining star of tomorrow in artificial intelligence.Visual question answering system is require to respond to a pair of arbitrary image and natural language question input,extract and then combine image feature and question feature,finally give the corresponding answer by reasoning.CNN is the most often used method to extract regional feature from image.However,CNN cannot supply distant dependency information.LSTM is often used to extract sequence feature of text,but LSTM cannot obtain regional dependency information.To combine the strength of these models,focusing on visual question answering task,we propose and implement a multi-channel convolutional neural network-long short memory(CNN-LSTM)model in this paper.The hybrid model can make use of convolutional neural network's ability of extract regional feature,as well as recurrent neural network's strength of processing a sequence.To prove the effectiveness of our model,we applied it to text classification task first.A neural network model,consisting of word embedding layer,multi-channel CNN-LSTM model,and fully connected layer,are used to solve Twitter sentiment analysis task of SemEval.After data retrieval,pre-processing,neural network training,and experiment with analysis,the proposed model turns out to be workable.For visual question answering task,image feature are represent by processed through a deep CNN network,and question feature are extract by multi-channel CNN-LSTM model.Features are then combined and used to predict answer.By comparing results of baseline model and model using multi-channel CNN-LSTM,we find that the new model can acquire both a higher accuracy and faster training process.This is also evidence for proving multi-channel CNN-LSTM model can extract needed feature effectively,and is ready for generalized usage even in different domain.

Keywords/Search Tags:

Visual question answering, Sentiment analysis, Convolutional neural network, Long short-term memory

PDF Full Text Request

Related items

1	Research On Affective Visual Question Answering
2	Sentiment Recognition For Short Annotated GIFs Using Visual-Textual Fusion
3	Research On Robot Question Answering System For Freshmen Register In Colleges And Universities
4	Research On Sentiment Polarity Analysis Of E-commerce Reviews Based On Deep Neural Network
5	Sentiment Analysis Of Short Text Based On Improved Bidirectional LSTM Neural Network
6	Research On Sentiment Analysis Of Chinese Text Based On Deep Learning
7	Design And Implementation Of An Epidemic Sentiment Analysis System Based On Deep Learning
8	Level And Aspect-term Sentiment Analysis Based On Deep Learning
9	Research On Sentiment Analysis Based On Deep Feature Representation
10	Research On Text Sentiment Analysis Algorithms Based On Deep Learning