
Research On Chinese Electronic Medical Record Named Entity Recognition Based On Multi-semantic Fusion

Posted on: 2023-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Sun
Full Text: PDF
GTID: 2544307028988109
Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet, artificial intelligence has gradually penetrated all areas of social life. In medicine, a field closely tied to everyday life, medical information systems that combine artificial intelligence with clinical practice have produced large volumes of electronic medical record (EMR) text. Extracting medical entities from these records accurately and quickly, and thereby transforming unstructured text into structured text that computers can process, is of great significance.

Chinese EMR text contains highly specialized medical vocabulary. Current Chinese EMR named entity recognition (NER) methods simply transfer general-domain NER to the medical field, but specialized medical terms make entity boundaries hard to determine, which leads to the problem of fuzzy vocabulary boundaries. In addition, existing Chinese EMR datasets are small, and a few entity types account for a large share of the samples, so model predictions are biased toward the more frequent entity types. To address these problems, this thesis proposes a multi-semantic fusion NER method for Chinese EMRs based on the radical and four-corner code information of Chinese characters. The main contributions are as follows:

1) An improved BiGRU-CRF NER model for Chinese EMRs is proposed. ALBERT is used to obtain dynamic vector representations, and a Mogrifier GRU is used to extract features. Building on the GRU, the hidden layer and the input layer are made to interact, which strengthens the feature extraction layer's ability to capture semantic relationships in the text and yields deep text features of the EMR.

2) A multi-semantic fusion NER model for Chinese EMRs is proposed. Building on the model above, the word-based Chinese EMR NER model fuses character, radical, and four-corner code vectors, learns the most basic structural information of the characters, obtains representations that capture the meaning of specialized medical vocabulary, and improves the model's ability to extract information from medical text. In addition, a vector label annotation strategy is proposed: a binary classifier filters out the non-entity regions of the EMR text, and the entity type to which each vector belongs is then labeled. The entity label information corresponding to each character is thereby introduced into the model, which reduces the influence of the unbalanced sample distribution and strengthens the model's learning of different entity types.

3) The proposed model is verified and analyzed on a Chinese EMR NER dataset. The experimental results show that the Mogrifier GRU, which strengthens the interaction between the hidden layer and the input layer, effectively improves the model's recognition performance. The influence of glyph information and of the vector label annotation strategy on the model is also analyzed, and ablation experiments are used to assess the actual contribution of each component. Finally, the model is evaluated on a general-domain dataset to further verify its effectiveness.
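The interaction between the hidden layer and the input layer mentioned in contribution 1) can be illustrated with a small sketch. It follows the gating scheme of the published Mogrifier LSTM formulation, in which the input and the previous hidden state alternately rescale each other before the recurrent cell is applied. Whether the thesis applies the scheme to the GRU in exactly this form is an assumption, and the names `mogrify`, `q_mats`, and `r_mats`, the dimensions, and the random weights are hypothetical choices for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def mogrify(x, h, q_mats, r_mats):
    """Let input x and hidden state h gate each other for several rounds.

    q_mats[k] maps h to a gate over x; r_mats[k] maps x to a gate over h.
    Each gate is 2 * sigmoid(.), so a zero pre-activation leaves the
    vector unchanged. The modulated pair would then be fed to a GRU cell
    (not shown here).
    """
    x, h = list(x), list(h)
    for k in range(len(q_mats)):
        # The hidden state rescales the input.
        gate_x = matvec(q_mats[k], h)
        x = [2.0 * sigmoid(g) * xi for g, xi in zip(gate_x, x)]
        if k < len(r_mats):
            # The updated input rescales the hidden state.
            gate_h = matvec(r_mats[k], x)
            h = [2.0 * sigmoid(g) * hi for g, hi in zip(gate_h, h)]
    return x, h


def random_matrix(rows, cols, rng):
    """Hypothetical weight initialisation, for demonstration only."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
x_dim, h_dim, rounds = 4, 3, 2
x = [rng.uniform(-1, 1) for _ in range(x_dim)]
h = [rng.uniform(-1, 1) for _ in range(h_dim)]
q_mats = [random_matrix(x_dim, h_dim, rng) for _ in range(rounds)]
r_mats = [random_matrix(h_dim, x_dim, rng) for _ in range(rounds)]
x_mod, h_mod = mogrify(x, h, q_mats, r_mats)
```

In a full model, `x` would be the ALBERT representation of a character and `(x_mod, h_mod)` would replace `(x, h)` as the arguments of the GRU update at each time step, once per direction in the bidirectional case.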
Keywords/Search Tags:Named entity recognition, Electronic medical record, ALBERT, Mogrifier GRU, Vocabulary enhancement