| Machine reading comprehension is a key research area in artificial intelligence; the task evaluates how well a system understands the deeper meaning of human language. As reading comprehension technology has evolved, multiple-choice reading comprehension datasets drawn from the Gaokao have received considerable attention because of their broad test coverage and complexity. Answering these questions involves complex semantic reasoning, which makes them a valuable testbed for exploring natural language processing in realistic human scenarios. In this paper, we study methods for answering multiple-choice questions in Gaokao Chinese reading comprehension and divide the task into three subtasks (evidence extraction, evidence summarization, and evidence verification) to optimize the answering process and improve the system’s answering ability and interpretability. The main research content and results are as follows: (1) In the evidence extraction subtask, to address insufficient evidence extraction when deep inference is required and model classification errors caused by training sample bias, this paper proposes an evidence extraction method based on BERT and adversarial training. The method first uses BERT as the language model for encoding, then adds a GRU layer to learn the positional information that is attenuated in BERT, then uses the Focal Loss function to alleviate prediction bias, and finally uses adversarial training to improve the model's generalization ability. Experimental results on the corresponding datasets show that evidence extraction performance improves by almost 6%. (2) In the evidence summarization subtask, to address the weak semantic induction ability of existing generative models and the insufficient consistency of their generated text, this paper proposes an evidence summarization method based on BART and contrastive learning. The method first uses BART to encode the evidence, then introduces semantic 
fragment cues for dynamic content planning and sentence generation, and finally designs a contrastive learning task to improve the consistency of the output sentences. Experimental results on the relevant datasets show that the sentences summarized by this method are close in content to the answer options and highly consistent. (3) In the evidence verification subtask, this paper addresses two problems: text modeling methods easily ignore important information such as inter-sentence and intra-sentence relationships, and it is difficult to capture the relationship between primary and secondary evidence when multiple pieces of evidence are involved. The method first constructs an evidence chain graph to model the options, evidence, and questions, which overcomes the difficulty of combining multiple levels of information in sequence modeling. It then uses an attention mechanism to assign different weights to the pieces of evidence, so that the model focuses on the evidence most relevant to the question and can combine multiple pieces of evidence semantically. Experimental results on the relevant datasets show an almost 5% improvement in answering performance over the baseline. (4) We design and implement a reading comprehension question-answering system for GCRC. Models trained with the evidence extraction, evidence summarization, and evidence verification methods proposed in this paper are combined in a modular fashion to answer questions with one click. |
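
The Focal Loss step in the evidence extraction subtask can be illustrated with a minimal sketch. It assumes a binary setting in which each candidate sentence is labeled as evidence (1) or not evidence (0); the function names and the default values of `gamma` and `alpha` are illustrative assumptions, not details taken from the paper. The loss scales the usual cross-entropy term by `(1 - p_t) ** gamma`, so examples the model already classifies well contribute less and hard or minority-class examples contribute more.

```python
import math


def focal_loss(prob_positive, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single example.

    prob_positive: predicted probability that the sentence is evidence.
    label: 1 if the sentence is evidence, 0 otherwise.
    gamma, alpha: illustrative defaults, not values taken from the paper.
    """
    eps = 1e-12  # guards against log(0)
    # p_t is the probability the model assigns to the true class.
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    # alpha_t balances the positive and negative classes.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of (probability, label) pairs."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With `gamma=0` and `alpha=0.5` the function reduces to half the ordinary binary cross-entropy, which is a convenient sanity check when tuning the two parameters.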