
Application Of Knowledge-embedded Pre-training Model In Reading Comprehension

Posted on: 2022-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: R T Bian
Full Text: PDF
GTID: 2518306572959639
Subject: Computer technology

Abstract/Summary:
With the development of artificial intelligence, its deep integration with traditional industries is under way, and many applications that change people's daily lives have appeared. Medicine is a field closely tied to people's livelihood, and the deep integration of artificial intelligence with medicine promises to substantially improve quality of life. The deep empowerment of medicine by artificial intelligence has therefore attracted the attention of many researchers, government staff, and industry practitioners. The abundant medical information on the Internet allows natural language processing techniques to obtain large amounts of the required text, making medical information mining possible.

This thesis focuses on the application of knowledge-embedded pre-training models in reading comprehension. First, it analyzes the knowledge barriers that existing pre-trained models encounter when transferred to the medical domain. We then enhance the pre-trained model with two different kinds of knowledge: structured knowledge from a knowledge graph and unstructured knowledge from text. Finally, oriented toward a real medical question answering scenario, this thesis designs the three modules of the system separately.

In the experiments on fusing structured knowledge from the knowledge graph, this thesis first proposes two pre-training approaches: a knowledge fusion method based on text enhancement and a knowledge fusion method based on graph representation enhancement. The experiments verify that the text-enhancement-based method achieves a clear improvement over the BERT baseline, while the graph-representation-enhancement-based method does not yield a clear improvement under the model structure used in this thesis.

In the experiments on fusing unstructured knowledge from text, the core idea is to use the masked language model task to pre-train on large-scale medical texts so that the model learns the implicit knowledge contained in the text. First, the structure of the BERT model used in this thesis is introduced; then three masking tasks are proposed: single-word masking, whole-word masking, and entity masking. Experiments show that entity masking outperforms whole-word masking, which in turn outperforms single-word masking.

In the design of the medical question answering system, the whole system is divided into three modules: retrieval, extraction, and ranking. This thesis first introduces the BM25 algorithm used by the retriever; then the pre-trained models obtained in Chapters 2 and 3 are applied to the reading comprehension task, and the best one is selected as the system's extraction model. Finally, a ranking model based on a self-attention mechanism over candidate answers is proposed; after analyzing the importance of answer context for answer ranking, a ranking model that combines documents and answers is further proposed. Experiments show that both improvements raise the final ranking performance.
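The abstract does not spell out how the text-enhancement fusion works internally; the following is a minimal, hypothetical sketch of the general idea of serializing knowledge-graph triples into the input text so that a standard BERT-style encoder can consume them as ordinary tokens. The toy knowledge graph, the naive entity matcher, and the [KNOWLEDGE] separator are all illustrative assumptions, not the thesis's actual design.

```python
# Hypothetical sketch of text-enhancement knowledge fusion: triples from a
# knowledge graph are serialized and appended to the passage so that a
# BERT-style encoder can consume them as plain text.

KNOWLEDGE_GRAPH = {
    "aspirin": [("aspirin", "treats", "fever"), ("aspirin", "is_a", "NSAID")],
}

def find_entities(text, kg):
    """Naive entity linking: return KG entities that appear verbatim in the text."""
    return [e for e in kg if e in text.lower()]

def enhance_with_triples(passage, kg):
    """Serialize the matched triples and append them to the passage."""
    facts = []
    for entity in find_entities(passage, kg):
        for head, relation, tail in kg[entity]:
            facts.append(f"{head} {relation.replace('_', ' ')} {tail}")
    if not facts:
        return passage
    return passage + " [KNOWLEDGE] " + " ; ".join(facts)

print(enhance_with_triples("Aspirin is commonly used by patients.", KNOWLEDGE_GRAPH))
```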
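To make the three masking strategies concrete, here is a small sketch of how single-word, whole-word, and entity masking differ, assuming that word and entity span boundaries are supplied by an upstream tokenizer and entity linker. The token sequence and span indices are invented for illustration only.

```python
import random

random.seed(0)
MASK = "[MASK]"

def single_word_mask(tokens, p=0.15):
    """Mask each (sub)token independently, as in BERT's original MLM task."""
    return [MASK if random.random() < p else t for t in tokens]

def span_mask(tokens, spans, p=0.15):
    """Mask whole spans (words or entities) as units: every token inside a
    selected span is replaced together, so the model must recover the full unit."""
    out = list(tokens)
    for start, end in spans:  # spans are (start, end) token index pairs
        if random.random() < p:
            for i in range(start, end):
                out[i] = MASK
    return out

tokens = ["a", "spir", "in", "treats", "fe", "ver"]
word_spans = [(0, 3), (3, 4), (4, 6)]  # whole-word boundaries
entity_spans = [(0, 3)]                # "aspirin" as a medical entity
print(single_word_mask(tokens))
print(span_mask(tokens, word_spans))    # whole-word masking
print(span_mask(tokens, entity_spans))  # entity masking
```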
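The retriever's BM25 algorithm is standard; a compact reference implementation (using the common Lucene-style IDF variant, with default parameters k1 = 1.5 and b = 0.75, which the thesis may tune differently) could look like this:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the query with BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for doc in docs:
        tf = Counter(doc)
        dl = len(doc)
        s = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            s += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores

docs = [["aspirin", "treats", "fever"], ["fever", "is", "a", "symptom"]]
print(bm25_scores(["aspirin", "fever"], docs))
```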
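The ranking module is described only at a high level; the sketch below shows one plausible way to let candidate answers attend to one another via self-attention before scoring, written in PyTorch. The hidden size, head count, and linear scoring head are assumptions, not the thesis's exact architecture.

```python
import torch
import torch.nn as nn

class AnswerRanker(nn.Module):
    """Illustrative ranker: candidate answer vectors (e.g. [CLS] embeddings of
    answer-document pairs) attend to one another via self-attention, then a
    linear head produces one relevance score per candidate."""

    def __init__(self, hidden=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, candidates):            # (batch, num_candidates, hidden)
        mixed, _ = self.attn(candidates, candidates, candidates)
        return self.score(mixed).squeeze(-1)  # (batch, num_candidates)

ranker = AnswerRanker()
cand_vecs = torch.randn(1, 5, 768)  # 5 candidate answer representations
print(ranker(cand_vecs).shape)      # torch.Size([1, 5])
```

Letting candidates attend to each other means each answer is scored relative to its competitors rather than in isolation, which is one natural reading of the abstract's claim that context matters for answer ranking.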
Keywords/Search Tags: pre-training model, knowledge fusion, knowledge graph, machine reading comprehension