
Research On Deep Learning-based Multi-document Passage Ranking Methods For Question Answering System

Posted on: 2021-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: D W Lin
Full Text: PDF
GTID: 2518306548995989
Subject: Software engineering
Abstract/Summary:
An automatic question answering system requires a machine to give answers after reading and understanding questions posed in natural language, and this ability can be used to measure the intelligence level of a computer system. Research on automatic question answering has therefore attracted great attention in both academia and industry. Because the knowledge source contains a large number of documents, the passage retrieval module is an important information filtering tool for a question answering system, which makes passage ranking a key technology in the field. In recent years, with the development of deep learning and the release of large-scale benchmark datasets for question answering and machine reading comprehension, passage retrieval has advanced considerably.

Although great progress has been made, many challenges remain: 1) the extractive framework designed for single-document question answering may lose important information when applied to multi-document tasks; 2) sentences used as answer units cannot contain enough information, while documents contain too much redundant information; 3) the passages provided in most current multi-document question answering datasets are of uneven quality and lack annotation; 4) most current passage ranking frameworks based on pre-trained language models have problems, arising from the characteristics of the objective function optimized in the pre-training phase, that restrict their ranking performance.

To address these challenges, this thesis focuses on passage ranking for question answering systems. It studies the construction of a passage ranking framework for multi-document question answering, the feasibility of paragraphs as answer units, the measurement of relevance between a passage and a question, and how to effectively incorporate a self-matching attention mechanism and external features into a passage ranker based on a pre-trained language model. The main contents and innovations of this work are summarized as follows.

Firstly, because multi-document question answering involves multiple documents and longer answers, this thesis argues that paragraph-level segments are suitable for answering questions. As evidence, it reports how the best-matching paragraphs for each type of question on the DuReader dataset perform when compared with the annotated answers. Based on this assumption, we propose a paragraph ranking framework that uses ROUGE-L as the relevance measure for paragraphs and a listwise approach to directly select the best-matching paragraphs as the answers to questions. Experimental results on a real-world dataset demonstrate that the proposed method obtains a significant improvement over state-of-the-art baselines.

Secondly, current passage ranking frameworks based on pre-trained language models cannot effectively extract the important information in passages, and therefore cannot effectively distinguish relevant passages from irrelevant ones. To address this problem, a passage ranking model is proposed. The model adopts a passage self-matching attention mechanism to extract the important information of the passage, and also incorporates the question type as an input feature to enhance the model's representation ability. Experimental results on an answer passage re-ranking dataset demonstrate that the proposed model achieves the best results compared with state-of-the-art models.
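
To make the idea of ROUGE-L as a paragraph relevance measure concrete, the following is a minimal sketch and not the thesis's implementation. It scores each candidate paragraph against an annotated answer using an LCS-based ROUGE-L F-score and picks the best-matching one. The whitespace tokenization, the `beta = 1.2` weighting, and the function names `lcs_length`, `rouge_l`, and `best_paragraph` are all assumptions made for illustration; a Chinese dataset such as DuReader would need a proper word or character segmenter.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate: Sequence[str], reference: Sequence[str],
            beta: float = 1.2) -> float:
    """ROUGE-L F-score; beta (assumed value) weights recall over precision."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


def best_paragraph(paragraphs: List[str], answer: str) -> int:
    """Index of the paragraph with the highest ROUGE-L against the answer.

    Whitespace tokenization is an assumption made to keep the sketch short.
    """
    reference = answer.split()
    scores = [rouge_l(p.split(), reference) for p in paragraphs]
    return max(range(len(paragraphs)), key=scores.__getitem__)
```

In a listwise setup, per-paragraph scores of this kind could serve as the supervision signal over the whole candidate list for a question, but how the thesis turns them into a training objective is not stated in the abstract.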
Keywords/Search Tags: Question Answering, Machine Reading Comprehension, Paragraph Ranking, Attention Mechanism, Retrieval Question Answering, Pre-Trained Models