Research On Deep Attention Network Structure For Image QA

Posted on: 2018-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: T Zhang
Full Text: PDF
GTID: 2428330518458889
Subject: Computer system architecture
Abstract/Summary:
Image QA, also known as the visual image test, is the task in which a computer, given an image and a question posed in natural language, automatically produces an appropriate answer based on the image content. Image QA is an important research topic in the field of computer vision. With the development of artificial intelligence, natural language processing, deep learning and image recognition, image QA has found a wide range of applications in fields such as car navigation, guidance for the blind and robot systems. Because a computer cannot perceive image information as completely as humans do, the accuracy of image QA depends mainly on how the image features and the question text are represented. In this thesis, we study image feature representation, text representation and the model that fuses them, and we analyze the network structures used in image QA in order to improve the accuracy of automatic image QA. For image representation, we select the convolutional neural network, which has local receptive fields, to obtain deep image information. For text representation, we use the long short-term memory network, which can capture the context before and after each position in the text, to represent the natural language question. To meet the requirements of image QA, we select the attention mechanism to fuse the image feature representation with the text representation, which yields the fusion model and forms the attention network. Because single-layer and double-layer attention networks still produce misjudgments, we improve the fusion model by deepening the network, strengthening the text information and weakening the attenuation of the text "memory", which forms the deep attention network, and we study its network structure. Finally, we select three widely accepted image QA datasets, namely DAQUAR-ALL, DAQUAR-REDUCED and VQA, to implement image QA based on the deep attention network. A comparison between the deep attention network and the double-layer attention network model shows that the deep attention network achieves higher accuracy. Experiments comparing deep attention networks with different numbers of layers show that the accuracy of image QA increases as the number of layers increases. The experimental results show that the deep attention network structure proposed in this thesis can improve the accuracy of image QA.
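The idea of stacking attention layers over image features and a question vector can be illustrated with a minimal sketch. This is not the network from the thesis: it assumes parameter-free dot-product scoring in place of learned weights, takes the region features and the question vector as given (in the thesis they would come from the convolutional neural network and the long short-term memory network), and uses the hypothetical names `attention_layer` and `deep_attention`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_layer(regions, query):
    """One attention layer (simplified, no learned parameters).

    regions: list of image-region feature vectors (assumed given).
    query:   current question vector, same dimension as each region.
    Returns the attention weights and the refined query.
    """
    # Score each region by its dot product with the query.
    scores = [sum(r * q for r, q in zip(region, query)) for region in regions]
    weights = softmax(scores)
    # Attended image vector: weighted sum of the region features.
    attended = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(len(query))
    ]
    # Refined query: previous query plus the attended image vector.
    refined = [q + a for q, a in zip(query, attended)]
    return weights, refined


def deep_attention(regions, question, num_layers):
    """Apply num_layers attention layers in sequence (hypothetical helper)."""
    query = list(question)
    all_weights = []
    for _ in range(num_layers):
        weights, query = attention_layer(regions, query)
        all_weights.append(weights)
    return all_weights, query
```

Calling `deep_attention(regions, question, num_layers=3)` returns one weight distribution per layer together with the final fused vector; in a full model that vector would feed an answer classifier, which is omitted here.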
Keywords/Search Tags: Image QA, Convolutional neural network, Long short-term memory network, Deep attention network, Network structure