
Research On Reading Comprehension Model Based On Attention Mechanism

Posted on: 2021-02-02  Degree: Master  Type: Thesis
Country: China  Candidate: S C Chen  Full Text: PDF
GTID: 2428330611967517  Subject: Control engineering
Abstract/Summary:
Machine reading comprehension (MRC) is a crucial research area within natural language processing (NLP) and has become one of the research hotspots in both academia and industry. Its goal is to teach machines to read a passage and answer questions about it. Existing models suffer from two problems. On the one hand, most models apply recurrent neural networks directly on top of pre-trained vector representations, so the resulting semantic vectors lack the representation capacity to express the semantics of the text well. On the other hand, the attention between the question and the context is not fully exploited, so the semantic information the model extracts lacks emphasis, which leads to relatively poor performance. This thesis reviews MRC models in detail, addresses these shortcomings of existing models, and constructs two MRC models that achieve better results. The main research content can be summarized in the following two aspects:

(1) To address the problem that some models train recurrent neural networks directly on pre-trained word vectors, which yields weak high-level text semantics, this thesis proposes a new model, BT-net. BT-net uses the BERT pre-trained language model to produce the word vector representations and stacks three Transformer layers on top of BERT's output to build multi-level representations of the context and the question. This further refines the attention between the context and the question, so that the model is expected to focus on the parts of the context that are relevant to the question. The output layer adopts skip connections: the output of the first Transformer layer is concatenated with the outputs of the other two layers, and the combined representation is used to predict the start and end positions of the answer. BT-net achieves an EM of 72.93% and an F1 of 75.96%; compared with the QA-net baseline, the EM and F1 values improve by 9.84% and 9.16%, respectively. This shows that using the BERT pre-trained language model as the word vector representation brings a large gain and that its language representation ability is stronger than that of earlier word vector models.

(2) To address the problem that the key information in the high-level semantic vectors of the text is not emphasized, this thesis proposes Bert-net, which is built on the BERT model and inspired by BiDAF's bi-directional attention flow. The model uses BERT-base to produce the word vector input, applies masking to the fused question-and-context vector to obtain separate question and context vectors, processes them with an improved multi-layer collaborative multi-head attention mechanism followed by a self-attention mechanism that further refines the attention over the context, and finally predicts the start and end positions of the answer. To determine the number N of collaborative multi-head attention layers in Bert-net, a comparative experiment was carried out; the model performs best when N = 7, reaching an EM of 73.41% and an F1 of 76.36%. Compared with the BERT baseline, the EM and F1 values increase by 1.9% and 1.89%, respectively. Ablation experiments show that the collaborative multi-head attention layers and the self-attention layer each alleviate, to varying degrees, the lack of emphasis on the semantic information of the text and improve the effectiveness of the model.
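The abstract does not include an implementation, but as a rough, hypothetical sketch of the BT-net output layer described above (three Transformer layers over BERT hidden states, a skip connection that concatenates the first layer's output with the later layers, and start/end span prediction), the PyTorch module below illustrates one way such a layer could be wired. All class names, dimensions, and the exact routing of the skip connections are illustrative assumptions, not the thesis's actual design.

import torch
import torch.nn as nn

class BTNetOutputLayer(nn.Module):
    """Hypothetical sketch of a BT-net-style output layer: three Transformer
    encoder layers over BERT hidden states, skip connections that concatenate
    the first layer's output with the later layers, and start/end span heads."""

    def __init__(self, hidden_size: int = 768, num_heads: int = 8):
        super().__init__()
        self.block1 = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=num_heads, batch_first=True)
        self.block2 = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=num_heads, batch_first=True)
        self.block3 = nn.TransformerEncoderLayer(d_model=hidden_size, nhead=num_heads, batch_first=True)
        # Concatenating two hidden states doubles the feature dimension fed to each head.
        self.start_head = nn.Linear(hidden_size * 2, 1)
        self.end_head = nn.Linear(hidden_size * 2, 1)

    def forward(self, bert_hidden: torch.Tensor):
        # bert_hidden: (batch, seq_len, hidden), e.g. BERT's last hidden states
        h1 = self.block1(bert_hidden)
        h2 = self.block2(h1)
        h3 = self.block3(h2)
        # Skip connection: combine the first layer's output with each later layer.
        start_logits = self.start_head(torch.cat([h1, h2], dim=-1)).squeeze(-1)
        end_logits = self.end_head(torch.cat([h1, h3], dim=-1)).squeeze(-1)
        return start_logits, end_logits

if __name__ == "__main__":
    dummy = torch.randn(2, 128, 768)  # stand-in for BERT output: 2 sequences of length 128
    start_logits, end_logits = BTNetOutputLayer()(dummy)
    print(start_logits.shape, end_logits.shape)  # torch.Size([2, 128]) torch.Size([2, 128])

In practice the per-token start and end logits would be converted to probabilities with a softmax over the sequence, and the answer span chosen as the highest-scoring valid (start, end) pair.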
Keywords/Search Tags: Machine Reading Comprehension, Text Representation, Neural Network, Attention Mechanism