
Research On Machine Reading Comprehension And Textual Question Answering

Posted on: 2020-06-13    Degree: Doctor    Type: Dissertation
Country: China    Candidate: M H Hu    Full Text: PDF
GTID: 1488306548992149    Subject: Computer Science and Technology
Abstract/Summary:
Textual question answering (QA), which aims to build computer systems that answer arbitrary natural language questions, is one of the most elusive challenges in Natural Language Processing and Artificial Intelligence. Reading comprehension style question answering, also known as machine reading comprehension (MRC), is a subtask of textual QA that has drawn massive attention from both academia and industry in recent years. The goal of MRC is to teach machines to read and understand human language text so as to answer related questions. Since this task can naturally be used to evaluate how well machines comprehend natural language, it has great research value. Moreover, MRC techniques can be widely applied in QA applications, search engines, and dialogue systems, and therefore also have huge practical value. In recent years, with the release of large-scale reading comprehension datasets and the rapid development of Deep Learning technology, research on MRC has achieved significant progress. Despite this success, many challenges remain: 1) problems in both the network structure and the training method of current approaches restrict model performance; 2) current state-of-the-art ensemble models are inefficient when deployed in real-world applications; 3) traditional approaches are usually designed under the assumption that an answer must exist in the passage, so they cannot handle unanswerable questions well; 4) most current models are designed for the single-passage setting and cannot be effectively extended to the open-domain scenario; 5) current models lack versatility in that they are unable to support numerical reasoning, multi-answer prediction, and so on. To address these challenges, this dissertation focuses on machine reading comprehension, conducting technical research and empirical analysis on attention mechanisms, training methods, knowledge distillation, answer verification architectures, open-domain question answering, and multi-type multi-answer prediction mechanisms. The main contributions of this work are summarized as follows.

Firstly, to address the attention redundancy/deficiency issue in current multi-attention architectures as well as the convergence suppression problem in reinforcement learning training methods, we propose a reinforced mnemonic reader for extractive MRC that boosts model performance. The proposed model introduces a reattention mechanism into the multi-attention architecture and uses a dynamic-critical reinforcement learning method during training to address these problems. Experimental results on reading comprehension benchmarks show that the proposed model obtains significant performance improvements over previous approaches, and our ensemble model even achieves human-level Exact Match (EM) performance.

Secondly, to address the biased distillation problem that occurs during knowledge distillation as well as the inability to efficiently distill intermediate representations, we propose an attention-guided answer distillation approach for MRC model compression that increases model efficiency. This approach jointly uses vanilla knowledge distillation, answer distillation, and attention distillation to compress an ensemble model into a single model with little performance loss. Experiments on three reading comprehension benchmarks show that the efficiency of the distilled single model is greatly improved, and the single model even outperforms the ensemble model on two datasets.

Thirdly, to address the probability interference problem in the no-answer reader as well as the lack of an independent answer verification stage in current approaches, we propose a read + verify architecture for reading comprehension with unanswerable questions that increases detection accuracy on unanswerable questions. This architecture consists of a no-answer reader that extracts candidate answers and detects unanswerable questions, together with an answer verifier that further judges whether the predicted answer is correct. Moreover, we introduce two auxiliary loss functions to enhance the no-answer reader, and explore three different model structures for the answer verification task. Experiments on the SQuAD 2.0 dataset show that the proposed architecture obtains significant improvements in detection accuracy on unanswerable questions.

Fourthly, to address the train-test inconsistency issue and the input re-encoding problem of current pipelined methods, we propose a retrieve-read-rerank network for open-domain question answering that improves the performance of open-domain QA systems. The model contains an early-stopped retriever, a distantly-supervised reader, and a span-level answer reranker. These components are assembled into a unified neural network for end-to-end training, which alleviates the train-test inconsistency issue. Besides, encoded representations are shared across components to avoid re-encoding inputs. Experiments on four open-domain QA datasets show that the proposed model outperforms previous pipelined methods in both effectiveness and efficiency.

Fifthly, to address the problems of current discrete-reasoning MRC models, such as incomplete answer type coverage, the inability to support multi-answer prediction, and isolated prediction of arithmetic expressions, we propose a multi-type multi-span network for discrete-reasoning reading comprehension that improves model performance on discrete-reasoning MRC. The model uses a multi-type answer predictor to support the prediction of four answer types, adopts a multi-span extraction method to dynamically extract one or multiple text spans, and employs an arithmetic expression reranking mechanism to rank expression candidates and further confirm the prediction. Experiments on the discrete-reasoning reading comprehension benchmark show that the proposed model significantly increases answer type coverage and the accuracy of multi-answer prediction, outperforming previous approaches by a large margin.
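The attention-guided answer distillation described above combines several objectives into one training loss. As a minimal sketch only, the following combines a temperature-scaled soft-target term on answer logits with an attention-matching term; the function names, loss weights (`alpha`, `beta`), and temperature are illustrative assumptions, not the dissertation's actual formulation:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mse(a, b):
    """Mean squared error between two flattened attention maps."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_logits, teacher_logits,
                      student_attn, teacher_attn,
                      temperature=2.0, alpha=0.5, beta=0.1):
    """Toy combined objective: soft-target distillation on answer logits
    plus an attention-matching penalty (weights are hypothetical)."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # standard temperature^2 rescaling so gradients stay comparable
    kd = kl_divergence(p_teacher, p_student) * temperature ** 2
    attn = mse(student_attn, teacher_attn)
    return alpha * kd + beta * attn
```

When student and teacher agree exactly, the loss is zero; any divergence in either the answer distribution or the attention maps increases it.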
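The read + verify decision for unanswerable questions can be pictured as two signals being combined. This is only a toy illustration under assumed mechanics: the reader normalizes span scores jointly with a no-answer score, and a hypothetical `verifier_score` (probability that the best span is actually supported) is averaged in; the threshold and the averaging rule are stand-ins, not the architecture's real combination method:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def read_then_verify(span_scores, na_score, verifier_score, threshold=0.5):
    """Toy read + verify decision.

    span_scores: reader scores for candidate answer spans
    na_score: reader score for the 'no answer' option
    verifier_score: hypothetical probability the best span is supported
    Returns the index of the best span, or None for 'unanswerable'.
    """
    probs = softmax(list(span_scores) + [na_score])
    span_probs, p_na_reader = probs[:-1], probs[-1]
    best_idx = max(range(len(span_probs)), key=lambda i: span_probs[i])
    # average the reader's and verifier's evidence for 'no answer'
    p_no_answer = (p_na_reader + (1.0 - verifier_score)) / 2.0
    if p_no_answer > threshold:
        return None
    return best_idx
```

The point of the second stage is exactly this kind of veto: even a high-scoring span can be rejected when the verifier finds it unsupported.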
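Multi-span extraction lets a single predictor emit a variable number of answer spans. One common realization, and the one assumed in this sketch (the dissertation's exact method may differ), is to tag each passage token with B/I/O labels and decode contiguous spans from the tag sequence:

```python
def decode_multi_spans(tokens, tags):
    """Decode one or more answer spans from per-token 'B'/'I'/'O' tags.

    'B' begins a new span, 'I' continues the current one, and 'O'
    (or an 'I' with no open span) closes whatever span is in progress.
    """
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
                current = []
    if current:  # flush a span that runs to the end of the passage
        spans.append(" ".join(current))
    return spans
```

For a question whose answer is a list (e.g. two laws named in a passage), the decoder naturally returns both spans instead of forcing a single one.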
Keywords/Search Tags:Textual Question Answering, Machine Reading Comprehension, Attention, Answer Verification, Open-Domain Question Answering, Discrete Reasoning