
Machine Reading Comprehension Based On Semantic Information

Posted on: 2021-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: C Fan
Full Text: PDF
GTID: 2428330611999997
Subject: Computer Science and Technology
Abstract/Summary:
Enabling computers to truly understand human language has long been the ultimate goal of natural language processing, and machine reading comprehension is an important way to evaluate that ability. In recent years, pretrained language models have further improved computers' language understanding and greatly raised system performance on many natural language processing tasks; on some tasks, the F1 score of certain models even exceeds human performance. This has prompted many researchers to ask whether machine reading comprehension has been solved. The answer is clearly no. Researchers have found that some reading comprehension datasets are not difficult enough: a model can select the correct answers to most questions from the passage using only shallow lexical similarity and answer-type information. This thesis focuses on the more difficult task of adversarial reading comprehension, in which misleading sentences are added to the passage to produce an adversarial dataset that evaluates a model's understanding of the text more precisely. It also proposes a reading comprehension model that integrates semantic information to handle this task. The main contents are as follows:

(1) Adversarial reading comprehension data generation based on the Transformer. Because existing reading comprehension datasets are relatively easy and their answers are easy to locate, a distractor sentence generation model based on the Transformer architecture is proposed. The rewritten question and a generated false answer are taken as input, an end-to-end model generates a distractor sentence, and that sentence is added to the passage to produce the adversarial reading comprehension dataset. Compared with existing work, this thesis trains a model to generate the distractor data automatically, which avoids the effort of constructing adversarial data by hand.

(2) A passage-question pair representation based on semantic role labeling. The original text is relatively flat, and its semantic information is deeply hidden and hard for existing models to exploit fully. This thesis therefore obtains predicate-argument structures through semantic role labeling and builds a graph from them that shows the semantic relationships among the components of a sentence more directly. It also uses coreference resolution to fuse the representations of coreferent entities, which strengthens the semantic connections across the context.

(3) A reading comprehension model based on semantic information. After the graph representation of the passage-question pair is constructed, each node in the question is matched to its corresponding node in the passage, and the passage node that corresponds to the wh-word node in the question is taken as the answer node, from which the answer is extracted. Comparison with other mainstream reading comprehension models shows that the proposed SRLG-QA model performs well on difficult datasets.
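The graph construction in (2) and the wh-word matching in (3) can be illustrated with a toy sketch. This is not the SRLG-QA model. It assumes the semantic role labeling frames and the coreference mapping are already available, so both are written by hand here. It also replaces learned node representations with exact string matching. The names `build_graph` and `extract_answer` and the example sentences are hypothetical.

```python
WH_WORDS = {"who", "what", "when", "where", "which", "why", "how"}


def build_graph(frames, coref=None):
    """Turn SRL frames into predicate nodes with role-labeled edges to argument nodes.

    `coref` maps a mention to its canonical entity, so coreferent mentions share one node.
    """
    coref = coref or {}
    graph = []
    for frame in frames:
        args = {}
        for role, mention in frame["args"].items():
            key = mention.lower().strip()
            args[role] = coref.get(key, key)
        graph.append({"predicate": frame["predicate"].lower(), "args": args})
    return graph


def extract_answer(question_graph, passage_graph):
    """Match the question's predicate node to passage nodes and return the
    passage argument that fills the role of the question's wh-word."""
    question = question_graph[0]  # assumption: a single-predicate question
    wh_role = next((r for r, a in question["args"].items() if a in WH_WORDS), None)
    if wh_role is None:
        return None
    best_score, best_answer = -1, None
    for node in passage_graph:
        if node["predicate"] != question["predicate"] or wh_role not in node["args"]:
            continue
        # Count the non-wh question arguments that the passage node has in the same role.
        score = sum(
            1
            for role, arg in question["args"].items()
            if role != wh_role and node["args"].get(role) == arg
        )
        if score > best_score:
            best_score, best_answer = score, node["args"][wh_role]
    return best_answer


# Hand-written frames standing in for the output of a semantic role labeler.
passage_frames = [
    {"predicate": "wrote", "args": {"ARG0": "Jane Austen", "ARG1": "Emma"}},
    {"predicate": "published", "args": {"ARG0": "She", "ARG1": "Persuasion"}},
    # Distractor-style sentence: same predicate as the question, different object.
    {"predicate": "wrote", "args": {"ARG0": "Tom Baker", "ARG1": "Emily"}},
]
coref = {"she": "jane austen"}  # hand-written coreference cluster

passage_graph = build_graph(passage_frames, coref)
question_graph = build_graph([{"predicate": "wrote", "args": {"ARG0": "Who", "ARG1": "Emma"}}])
print(extract_answer(question_graph, passage_graph))
```

In this toy setting the distractor frame shares the question's predicate but not its object, so it scores lower than the genuine frame. The abstract attributes the model's robustness on adversarial data to this kind of role-aware matching, as opposed to matching on surface word overlap.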
Keywords/Search Tags: Machine reading comprehension, adversarial dataset generation, semantic role labeling, question answering