Machine reading comprehension is a core task in natural language processing. Automatic question answering (QA) for the college entrance examination (Gaokao in Chinese) has been an important challenge in reading comprehension in recent years. This paper studies answer generation for scientific and technical reading comprehension questions in the Gaokao. The main contributions are as follows. (1) A method of answer sentence extraction based on the pre-trained model CPT and ensemble learning is proposed. Traditional rule-based matching methods struggle to extract candidate sentences that carry deep semantic information. To address this, the paper first uses data augmentation to alleviate the imbalance between positive and negative samples in the training data. Second, several candidate sentence prediction models are built on the CPT model, so that the semantic relevance between the question and each candidate sentence is learned from multiple perspectives. Finally, the six highest-scoring candidate sentences (Top-6) are extracted as the answer sentences using ensemble learning. This method effectively improves the accuracy of the answers extracted by the QA model. (2) A method of answer sentence summarization that combines the pre-trained model CPT with integer linear programming is proposed. Because the extracted answer sentences are not concise and contain much redundant information, the paper summarizes them. First, the optimized CPT model generates a summary sentence containing the important information of the original sentence. Then, various constraints are introduced through integer linear programming to generate answer sentences that are highly relevant to the question, syntactically complete, and fluent. The method is tested on Beijing Gaokao questions, and the generated answer sentences outperform various baseline models on multiple evaluation metrics, which verifies the effectiveness of the method. (3) Based on the above methods, an answer generation system for Chinese reading comprehension in the Gaokao is designed and implemented. The system comprises an answer sentence extraction module and an answer sentence summarization module. Its interface is clean and simple, and its functionality is complete. It can answer Chinese Gaokao reading comprehension questions quickly and accurately.
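The Top-6 selection step in contribution (1) can be illustrated with a minimal sketch. The abstract does not say how the individual models' scores are combined, so the unweighted average below is an assumption, as are the function name `select_top_k`, the score format, and the choice to return the selected sentences in their original document order. The scores stand in for the output of the CPT-based prediction models, which are not reproduced here.

```python
from statistics import mean


def select_top_k(candidates, model_scores, k=6):
    """Combine per-model relevance scores and return the k best sentences.

    candidates:   list of candidate sentences from the passage.
    model_scores: one list of scores per model, each aligned with candidates.
    """
    if any(len(scores) != len(candidates) for scores in model_scores):
        raise ValueError("each model must score every candidate")

    # Assumed ensemble rule: unweighted mean of the models' scores.
    ensemble = [mean(per_sentence) for per_sentence in zip(*model_scores)]

    # Rank candidate indices by ensemble score, highest first.
    ranked = sorted(range(len(candidates)), key=lambda i: -ensemble[i])

    # Keep the k best and restore document order (an assumption made here
    # so that the extracted sentences read naturally).
    top = sorted(ranked[:k])
    return [candidates[i] for i in top]


# Hypothetical scores from two models for eight candidate sentences.
sentences = [f"s{i}" for i in range(8)]
model_a = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4]
model_b = [0.7, 0.3, 0.6, 0.0, 0.9, 0.0, 0.8, 0.2]
answer_sentences = select_top_k(sentences, [model_a, model_b], k=6)
```

Averaging is only one option. Weighted voting or a learned combiner would fit the same interface by replacing the line that computes `ensemble`.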