Research On Machine Reading Comprehension Model Based On Passage Reranking And Hierarchical Information

Posted on: 2022-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Cong
GTID: 2518306569981069
Subject: Computer technology
Abstract/Summary:
Machine reading comprehension is an open-domain question answering task in which a machine answers questions given the questions and their corresponding texts. As a key topic in natural language processing, it helps drive the development of natural language research. Natural language has an inherent hierarchical structure: words, phrases, sentences, paragraphs, and documents. Existing studies show that hierarchical information helps machines understand natural language in depth, but most previous work has focused on attention information, position information, and the overall performance of the model, neglecting the role of hierarchical information. In addition, because machine reading comprehension datasets contain a large number of irrelevant paragraphs, paragraph filtering and selection also need to be studied.

This thesis therefore proposes a new model, called the PH model, that is based on a passage reranking framework and hierarchical information. The PH model consists of a passage reranking framework and a hierarchical neural network model. Within the passage reranking framework, the thesis proposes a ranking method based on F1, BLEU, and ROUGE-L, combined with a heuristic paragraph processing strategy, to filter out irrelevant paragraphs. In the hierarchical neural network model, word vectors are encoded together with the hierarchical information of paragraphs, and the bidirectional attention representation is combined with the hybrid encoding representation. Finally, a pointer network predicts the answer.

The hierarchical encoding layer and the hybrid encoding layer proposed in this thesis implement, respectively, the encoding of paragraph hierarchical information and an information "rereading" mechanism. In the hierarchical encoding layer, an ordered neurons LSTM (ON-LSTM) extracts hierarchical information, and Gumbel-Softmax is used to resolve the ambiguity of hierarchy boundaries. The hybrid encoding layer simulates the way humans reread text: it mixes encoding with independent parameters and encoding with shared parameters to realize the "rereading" mechanism. The fusion layer fuses the attention representation with the hybrid encoding representation and adjusts the dimensions of the result before passing it to the pointer network. The pointer network encodes the paragraphs relevant to the question and generates a probability distribution from which the predicted answer is obtained.

Experimental results show that the PH model outperforms other models on the DuReader 2.0 dataset. Compared with the ROUGE-L score of 55.30% obtained by the pre-trained BERT model, the PH model achieves a new best score of 56.42%, an absolute improvement of more than 1 percentage point. Compared with the baseline model's score of 36.54%, the PH model improves ROUGE-L by 19.88 percentage points. Finally, ablation experiments verify the effectiveness of the passage reranking framework and the hybrid hierarchical encoding, which supports the theory and design of the model.
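
The abstract names F1, BLEU, and ROUGE-L as the basis of the passage ranking but does not say how the three scores are combined or what each paragraph is scored against. The following is only a minimal sketch under stated assumptions, not the thesis's method: paragraphs are scored against a single reference string (for example the question or a gold answer), tokens are split on whitespace, BLEU is simplified to clipped unigram precision with a brevity penalty, the three scores are averaged with equal weight, and the function names, `beta=1.2`, and `top_k` are all hypothetical.

```python
from collections import Counter
import math


def token_f1(candidate, reference):
    """Harmonic mean of unigram precision and recall."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """ROUGE-L F-measure from LCS-based precision and recall.

    beta=1.2 is an assumed weighting, not a value taken from the thesis.
    """
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


def bleu1(candidate, reference):
    """Simplified BLEU (assumption): clipped unigram precision times brevity penalty."""
    if not candidate:
        return 0.0
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    precision = overlap / len(candidate)
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * precision


def rerank(paragraphs, reference, top_k=2):
    """Keep the top_k paragraphs by the equal-weight mean of the three scores."""
    ref = reference.split()

    def score(paragraph):
        cand = paragraph.split()
        return (token_f1(cand, ref) + bleu1(cand, ref) + rouge_l(cand, ref)) / 3

    return sorted(paragraphs, key=score, reverse=True)[:top_k]


if __name__ == "__main__":
    paragraphs = [
        "the cat sat on the mat",
        "stock prices fell sharply today",
        "a cat on the mat",
    ]
    for paragraph in rerank(paragraphs, "cat on the mat"):
        print(paragraph)
```

For Chinese text such as DuReader, the whitespace split would have to be replaced by character-level or word-segmented tokens, and the equal weighting is only one of many possible ways to combine the three metrics.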
Keywords/Search Tags:Natural language processing, Multi-passage reading comprehension, Hierarchical information, Passage reranking