Visual Question Answering Based On Dynamic Parameter Memory Network And High Level Concepts

Posted on: 2019-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: S W Li
GTID: 2428330548994031
Subject: Computer application technology
Abstract/Summary:
Visual Question Answering (VQA) is a prominent task at the intersection of Computer Vision (CV) and Natural Language Processing (NLP). Current methods follow a few main approaches. The first combines a Convolutional Neural Network (CNN) with a Recurrent Neural Network (RNN) to perform VQA. The Dynamic Memory Network (DMN) is a typical representative of a second approach: it adds memory and attention mechanisms to the neural network architecture to improve reasoning ability, and it scores well on various VQA tasks. A third approach adds a dynamic parameter layer to the CNN, with the weights of that layer determined adaptively from the question, and it has also achieved good results on this task. However, the CNN-RNN combination does not explicitly represent high-level semantic concepts; it attempts to map directly from image features to text. The DMN method fails to give strong question answering results when the supporting facts are not labeled during training, although its results improve when it is combined with dynamic parameters. In view of these shortcomings, this thesis proposes a VQA algorithm based on a dynamic parameter memory network and high-level concepts. The algorithm first uses a separate parameter prediction network for adaptive parameter prediction. This network consists of a Gated Recurrent Unit (GRU) that takes the question as input and a fully connected layer that outputs a set of candidate weights. A hashing technique connects the parameter prediction network to the dynamic parameter layer, which is a fully connected layer of the CNN: a predefined hash function selects from the candidate weights to determine each weight in the dynamic parameter layer. Then, following the DMN architecture, an attention mechanism is added to the CNN, and a modified GRU with a memory mechanism replaces the RNN, yielding a composite CNN-GRU architecture. Finally, high-level semantic concepts are integrated into the CNN-GRU method. Experimental results show that the method makes significant progress in VQA and achieves good results on several VQA benchmark datasets.
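The hashing step in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the function names, the use of CRC32 as the predefined hash, and the layer sizes are all assumptions made for illustration. In particular, `predict_candidates` is only a placeholder for the GRU plus fully connected parameter prediction network, which the abstract does not specify in detail.

```python
import random
import zlib


def predict_candidates(question, num_candidates):
    """Placeholder (assumption) for the GRU + fully connected network.

    It returns pseudo-random values seeded by the question text, only so
    that different questions yield different candidate weights.
    """
    rng = random.Random(zlib.crc32(question.encode("utf-8")))
    return [rng.uniform(-1.0, 1.0) for _ in range(num_candidates)]


def hash_index(row, col, num_candidates):
    """Predefined, fixed hash from a weight position to a candidate slot.

    CRC32 is an assumption; any fixed hash function plays the same role.
    """
    key = f"{row},{col}".encode("utf-8")
    return zlib.crc32(key) % num_candidates


def build_dynamic_weights(candidates, out_dim, in_dim):
    """Fill an out_dim x in_dim weight matrix by hashing into candidates."""
    k = len(candidates)
    return [
        [candidates[hash_index(i, j, k)] for j in range(in_dim)]
        for i in range(out_dim)
    ]


def dynamic_layer(features, candidates, out_dim):
    """Apply the dynamic parameter layer (no bias) to an image feature vector."""
    weights = build_dynamic_weights(candidates, out_dim, len(features))
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


if __name__ == "__main__":
    image_features = [0.5, -1.0, 2.0, 0.25]  # hypothetical CNN features
    candidates = predict_candidates("What color is the cat?", num_candidates=6)
    print(dynamic_layer(image_features, candidates, out_dim=3))
```

The point of the hashing is that the prediction network only has to output a small set of candidates (6 here) while the dynamic layer still gets a full weight matrix (3 x 4 here), with many positions sharing the same candidate value.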
Keywords/Search Tags: Dynamic Parameter Memory Network, High Level Semantic Concepts, Visual Question Answering, Convolution Neural Network, Gated Recurrent Unit