
Research on Span-Extractive Machine Reading Comprehension Model Based on QA-Net

Posted on: 2022-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Cheng
Full Text: PDF
GTID: 2518306605465984
Subject: Master of Engineering
Abstract/Summary:
Language is the means by which people communicate with one another. To make life more convenient, machines must behave in ways that meet human requirements, and enabling machines to understand human language and communicate with human beings is an essential step toward language intelligence. Natural language processing (NLP) is the technology that studies communication and interaction between humans and machines in natural language. Its main application scenarios include machine translation, automatic text generation, question answering systems, sentiment analysis, and machine reading comprehension. Machine reading comprehension is the technique by which a machine understands the content and requirements of a given text or related material and returns correct feedback to humans. It is a very important branch of natural language processing, and this thesis carries out research on this task.

Deep learning has brought machine reading comprehension to a satisfactory level, but accuracy is not the only test criterion for a deep neural network: time cost and hardware conditions must also be considered. Methods based on the Recurrent Neural Network (RNN) can achieve good accuracy, but the serial nature of the structure makes their running time too long and their time cost high. To explore a better network structure, this thesis adopts the QA-Net model, composed of Convolutional Neural Network (CNN) and Transformer modules, as its framework. By modeling different modules on span-extraction data sets, the thesis makes the following contributions:

1. A machine reading comprehension model based on QA-Net with dilated convolution and information fusion. Dilated convolution enlarges the receptive field so that features spaced further apart can be extracted: fewer convolution layers capture longer-range information, saving resources compared with a traditional convolutional neural network (a minimal sketch of such a layer follows this list). The information fusion component processes the output of the attention interaction layer several times to retain the better features, so that the trained network weights predict span positions more accurately. Experimental results show that dilated convolution and information fusion each improve the accuracy of the model by a small margin, and the two methods are compatible.

2. A machine reading comprehension model based on QA-Net with a persistent attention mechanism. Because the QA-Net network stacks multiple encoder modules, each containing CNN and Transformer structures, the influence of the parameter count cannot be ignored under limited hardware conditions. The method reduces the number of parameters by removing the feed-forward layer inside the module, so that the module consists only of depthwise separable convolution and an attention mechanism; a new position encoding scheme that embeds information with fixed position encodings completes the new encoder module (see the second sketch after this list). Experiments on different span-extraction data sets show that the module, combined with the convolutional neural network, effectively reduces the number of parameters of the overall model.
3. A machine reading comprehension model based on QA-Net with multi-dimensional information enhancement. The method strengthens feature processing by combining self-attention with convolution: after the word vectors enter the information enhancement module, the features are split into two parts with the same number of channels, which pass through a self-attention layer and a convolution layer respectively, so that global and local features are each extracted by the structure best suited to them (see the third sketch after this list). Because the number of feature channels is halved, the computation is lower than feeding the full feature into the self-attention layer, and the information is processed in parallel rather than in the original serial mode, representing the word vectors more fully from multiple perspectives.
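The following is a minimal PyTorch sketch of the dilated convolution idea in contribution 1. The class name, kernel size, and dilation rate are illustrative assumptions, not the thesis's actual configuration; the depthwise separable form follows QA-Net's convention. With kernel size k and dilation d, one layer covers (k - 1) * d + 1 positions, which is why long-range information is reached with fewer layers.

import torch
import torch.nn as nn

class DilatedDepthwiseSeparableConv(nn.Module):
    # Depthwise separable 1-D convolution with dilation: the dilation
    # widens the receptive field without adding layers or parameters.
    def __init__(self, channels, kernel_size=7, dilation=2):
        super().__init__()
        padding = (kernel_size - 1) * dilation // 2   # preserve sequence length
        self.depthwise = nn.Conv1d(channels, channels, kernel_size,
                                   padding=padding, dilation=dilation,
                                   groups=channels, bias=False)
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x):            # x: (batch, seq_len, channels)
        x = x.transpose(1, 2)        # Conv1d expects (batch, channels, seq_len)
        x = torch.relu(self.pointwise(self.depthwise(x)))
        return x.transpose(1, 2)

x = torch.randn(2, 100, 128)        # 2 passages, 100 tokens, 128 channels
print(DilatedDepthwiseSeparableConv(128)(x).shape)   # torch.Size([2, 100, 128])

With kernel_size=7 and dilation=2, the receptive field per layer grows from 7 to 13 positions at no extra parameter cost.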
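Next, a sketch of the slimmed encoder block in contribution 2, under the same caveat: the abstract says only that the feed-forward layer is removed and that fixed position encoding is used, so the sinusoidal table, layer sizes, and names below are assumptions.

import math
import torch
import torch.nn as nn

def fixed_positional_encoding(max_len, d_model):
    # A fixed (non-learned) sinusoidal position table; assumed form.
    pos = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                    * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, d_model, kernel_size=7):
        super().__init__()
        self.dw = nn.Conv1d(d_model, d_model, kernel_size,
                            padding=kernel_size // 2, groups=d_model, bias=False)
        self.pw = nn.Conv1d(d_model, d_model, 1)

    def forward(self, x):            # (batch, seq_len, d_model)
        return torch.relu(self.pw(self.dw(x.transpose(1, 2)))).transpose(1, 2)

class SlimEncoderBlock(nn.Module):
    # Encoder block without the Transformer feed-forward sub-layer: only
    # depthwise separable convolutions and self-attention remain, which
    # removes the two large feed-forward projection matrices.
    def __init__(self, d_model=128, num_convs=2, num_heads=8, max_len=512):
        super().__init__()
        self.register_buffer("pe", fixed_positional_encoding(max_len, d_model))
        self.convs = nn.ModuleList([DepthwiseSeparableConv(d_model)
                                    for _ in range(num_convs)])
        self.norms = nn.ModuleList([nn.LayerNorm(d_model)
                                    for _ in range(num_convs + 1)])
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, x):            # (batch, seq_len, d_model)
        x = x + self.pe[: x.size(1)]              # fixed position information
        for conv, norm in zip(self.convs, self.norms):
            x = x + conv(norm(x))                 # pre-norm residual convolution
        y = self.norms[-1](x)
        y, _ = self.attn(y, y, y, need_weights=False)
        return x + y                              # no feed-forward sub-layer

In a standard Transformer block, the feed-forward sub-layer alone holds roughly 2 * d_model * d_ff weights, which is where the parameter saving comes from.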
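Finally, a sketch of the channel-split enhancement module in contribution 3, with the same hypothetical names and sizes: the channels are halved, one half is modeled globally by self-attention and the other locally by convolution, in parallel, and the halves are concatenated back.

import torch
import torch.nn as nn

class DualBranchEnhancer(nn.Module):
    # Split the feature channels in two: the first half passes through
    # self-attention (global features), the second through a depthwise
    # separable convolution (local features); the branches run in parallel
    # and are concatenated back at the end.
    def __init__(self, d_model=128, num_heads=4, kernel_size=7):
        super().__init__()
        assert d_model % 2 == 0
        half = d_model // 2
        self.attn = nn.MultiheadAttention(half, num_heads, batch_first=True)
        self.dw = nn.Conv1d(half, half, kernel_size,
                            padding=kernel_size // 2, groups=half, bias=False)
        self.pw = nn.Conv1d(half, half, 1)

    def forward(self, x):                # x: (batch, seq_len, d_model)
        g, c = x.chunk(2, dim=-1)        # two parts with equal channel counts
        g, _ = self.attn(g, g, g, need_weights=False)           # global branch
        c = torch.relu(self.pw(self.dw(c.transpose(1, 2)))).transpose(1, 2)  # local branch
        return torch.cat([g, c], dim=-1)  # re-join the two views of each token

Because the dominant attention term scales linearly with the channel width, halving the channel count lowers the attention branch's cost relative to attending over the full feature, consistent with the reduction in computation described above.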
Keywords/Search Tags:Natural Language Processing, Machine Reading Comprehension, Transformer, QA-Net, Attention Mechanism, Convolutional Neural Network