
Research On Span-Extraction Machine Reading Comprehension Models Based On Pre-Trained Language Models

Posted on: 2023-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: T X Xiao
Full Text: PDF
GTID: 2568306914977139
Subject: Information and Communication Engineering
Abstract/Summary:
Machine reading comprehension is a key technology in natural language processing, with wide applications in web search, open-domain question answering, and other fields. This thesis studies span-extraction machine reading comprehension based on pre-trained language models. Starting from a baseline model, performance is improved through a series of techniques; on SQuAD 2.0, the standard benchmark for span-extraction machine reading comprehension, the final model achieves the best results compared with the methods of other researchers.

To strengthen the interaction between paragraphs and questions in reading comprehension, this thesis proposes an interaction-enhanced network with two main components: (1) a masked attention layer; (2) word boundary information introduced into the interaction module. Experimental results show that this method effectively improves the performance of the span-extraction reading comprehension model and helps the answer prediction module output more accurate answer spans.

For transfer learning with pre-trained language models, this thesis proposes a two-stage fine-tuning paradigm. By freezing parameters during training, it alleviates the inconsistency between the pre-training and fine-tuning stages, which stabilizes fine-tuning and thus improves performance. Experimental results show that the pre-trained language model ELECTRA achieves improvements on span-extraction reading comprehension and multiple text classification tasks when this method is used, demonstrating the effectiveness of the two-stage fine-tuning paradigm.

To further alleviate overfitting and improve generalization, two methods are used: (1) adding additive Gaussian noise at the model input; (2) applying an entropy penalty to the output probability distribution. Experimental results show that both methods effectively improve the performance of the span-extraction reading comprehension model.

Finally, model ensembling is applied to further improve performance. A voting-then-averaging ensemble strategy is used; a length penalty is introduced to optimize the ranking strategy, and a variety of pre-trained language models are selected to increase the diversity of the ensemble. Combining all the methods proposed in this thesis, the ensemble model achieves the best results on the hidden test set of SQuAD 2.0 compared with the methods of other researchers, demonstrating the effectiveness of the proposed methods.
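The masked attention layer described above can be illustrated with a minimal sketch. This is not the thesis's implementation; it assumes a standard scaled dot-product attention from question tokens to paragraph tokens, where padded paragraph positions are masked out with a large negative bias before the softmax so they receive (near-)zero attention weight. All shapes and names here are illustrative.

```python
import numpy as np

def masked_attention(question, paragraph, para_mask):
    """Attend from each question token to paragraph tokens.

    question:  (m, d) question token vectors
    paragraph: (n, d) paragraph token vectors
    para_mask: (n,)   1 for real tokens, 0 for padding
    Returns the attended context (m, d) and the weights (m, n).
    """
    scores = question @ paragraph.T / np.sqrt(question.shape[-1])   # (m, n)
    # Masked positions get a large negative score so softmax zeroes them out.
    scores = np.where(para_mask[None, :] == 1, scores, -1e9)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)                  # row-wise softmax
    return weights @ paragraph, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(3, 8))            # 3 question tokens
p = rng.normal(size=(5, 8))            # 5 paragraph positions
mask = np.array([1, 1, 1, 0, 0])       # last two positions are padding
ctx, weights = masked_attention(q, p, mask)
```

After the softmax, the padded positions contribute nothing to the attended context, which is the point of masking: the interaction module only mixes information from real paragraph tokens.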
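The two-stage fine-tuning idea (freeze the pre-trained parameters first, then unfreeze) can be sketched with a toy SGD update over a named parameter dictionary. The parameter names and the tiny gradients are hypothetical; a real setup would toggle `requires_grad` on a framework's modules rather than filter a dict.

```python
import numpy as np

def sgd_step(params, grads, lr, trainable):
    """Apply one SGD step, updating only parameters named in `trainable`."""
    return {k: (v - lr * grads[k] if k in trainable else v)
            for k, v in params.items()}

# Toy "model": a pre-trained encoder weight and a freshly initialised task head.
params = {"encoder.w": np.ones(4), "head.w": np.zeros(4)}
grads  = {"encoder.w": np.full(4, 0.5), "head.w": np.full(4, 0.5)}

# Stage 1: freeze the pre-trained encoder and train only the new head, so the
# randomly initialised head cannot disturb the pre-trained weights early on.
stage1 = sgd_step(params, grads, lr=0.1, trainable={"head.w"})

# Stage 2: unfreeze everything and fine-tune all parameters jointly.
stage2 = sgd_step(stage1, grads, lr=0.1, trainable=set(stage1))
```

The design intuition matches the abstract: stage 1 lets the task head adapt to the frozen pre-trained representation, reducing the mismatch between pre-training and fine-tuning before the full network is updated in stage 2.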
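The two regularization methods can also be sketched briefly. The noise scale `sigma` and penalty weight `beta` are illustrative hyperparameters, not the thesis's values; the entropy term here follows the common "confidence penalty" form, where adding the negative entropy to the loss discourages over-confident output distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_input(embeddings, sigma=0.1):
    """Add additive Gaussian noise to input embeddings (training time only)."""
    return embeddings + rng.normal(scale=sigma, size=embeddings.shape)

def entropy_penalty(logits, beta=0.01):
    """Penalty term added to the loss: -beta * H(p).

    Higher-entropy (less confident) distributions lower the loss,
    so the model is pushed away from over-confident predictions.
    """
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)   # softmax
    entropy = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return -beta * entropy.mean()

emb = rng.normal(size=(4, 8))
noisy = noisy_input(emb)
```

A uniform (maximum-entropy) distribution yields the most negative penalty, while a sharply peaked one yields a penalty near zero, so the term counteracts overfitting to confident answers.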
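The voting-then-averaging ensemble with a length penalty can be sketched as follows. This is an illustrative interpretation, not the thesis's exact ranking formula: spans are ranked first by how many models proposed them (voting), with ties broken by the average model score minus a linear length penalty; `alpha` is a hypothetical penalty weight.

```python
from collections import defaultdict

def ensemble_spans(predictions, alpha=0.05):
    """Pick the best answer span from several models' candidates.

    predictions: one list per model of (start, end, score) candidates.
    Vote first (count of models proposing each span), then average scores,
    subtracting a length penalty so overly long spans are demoted.
    """
    votes, scores = defaultdict(int), defaultdict(float)
    for model_preds in predictions:
        for start, end, score in model_preds:
            votes[(start, end)] += 1
            scores[(start, end)] += score
    ranked = sorted(
        votes,
        key=lambda s: (votes[s],                       # primary: vote count
                       scores[s] / votes[s]            # tie-break: mean score
                       - alpha * (s[1] - s[0] + 1)),   # minus length penalty
        reverse=True)
    return ranked[0]

preds = [
    [(3, 5, 0.9), (3, 9, 0.8)],    # model A
    [(3, 5, 0.8), (10, 12, 0.7)],  # model B
    [(3, 9, 0.85), (3, 5, 0.7)],   # model C
]
best = ensemble_spans(preds)
```

Using diverse pre-trained backbones makes the individual candidate lists less correlated, which is what gives the voting stage its benefit.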
Keywords/Search Tags: span-extraction reading comprehension, deep learning, attention mechanism, pre-trained language model, transfer learning