
Study On Named Entity Recognition Method Of Clinical Cardiology Text

Posted on: 2021-03-01 | Degree: Master | Type: Thesis
Country: China | Candidate: B J Zhang | Full Text: PDF
GTID: 2504306047981309 | Subject: Master of Engineering
Abstract/Summary:
China is a populous country with a high incidence of heart disease. Every year, a large number of patients with heart disease are admitted to hospital for clinical treatment, producing a large volume of clinical texts that record their treatment and contain rich medical information. Extracting and mining this information with suitable techniques would greatly improve the efficiency of medical personnel and, to some extent, support the diagnosis and treatment of patients. The main research content of this thesis is therefore the identification and extraction of medical named entities in clinical heart disease text.

The medical text data used in this thesis come from 3000 real clinical electronic medical records of heart disease. By analyzing the original records and considering their organizational structure and text content, we extracted the text recording key medical information as the corpus of this study. We then built an authoritative medical term dictionary and used a word segmentation tool to preprocess the text, including word segmentation and part-of-speech tagging. On this basis, the text was annotated with entities, yielding the data set for the medical entity recognition task.

For the recognition algorithm, this thesis proposes applying the radical features of Chinese characters to medical entity recognition. By constructing a Chinese character dictionary and applying an algorithm for extracting and representing radical features, the radical features are extracted and represented as vectors. Feature fusion is then used to combine the character, word, and radical features. On this basis, the Bi-LSTM-CRF and ID-CNN-CRF models, both widely used in entity recognition, are improved by taking the fused feature vector as the model input for training, and the two networks are combined into a joint Bi-LSTM-ID-CNN-CRF model for feature learning. These models are then applied to entity recognition in clinical heart disease text.

Finally, we experimentally verify and analyze each model on the constructed data set. The results show that applying radical features to the entity recognition task improves model performance to some extent, and that joint training on the combined features also performs well. To verify the applicability of the models, we also ran comparative experiments on a public electronic medical record data set; the results show that performance improves over the benchmark model there as well.
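
The fusion of character, word, and radical features described in the abstract can be illustrated with a minimal sketch. It assumes fusion by simple concatenation, which the abstract does not confirm; the radical table, the embedding sizes, and the names `RADICALS`, `make_table`, and `fuse_features` are hypothetical and chosen only for illustration. The random vectors stand in for embeddings that a real model would learn.

```python
import random

# Hypothetical character-to-radical table; a real system would derive this
# from a full Chinese character dictionary.
RADICALS = {"心": "心", "脏": "月", "病": "疒", "痛": "疒", "胸": "月"}


def make_table(keys, dim, seed):
    """Build a toy embedding table of random vectors (stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return {key: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for key in keys}


def lookup(table, key, dim):
    """Return the vector for key, or a zero vector if the key is unknown."""
    return table.get(key, [0.0] * dim)


def fuse_features(words, char_emb, word_emb, rad_emb, dims):
    """Return one fused vector per character.

    Each vector concatenates the embedding of the character, of the
    segmented word that contains it, and of the character's radical.
    Concatenation is an assumption, not necessarily the thesis's method.
    """
    char_dim, word_dim, rad_dim = dims
    fused = []
    for word in words:
        for ch in word:
            radical = RADICALS.get(ch, "<unk>")
            vec = (
                lookup(char_emb, ch, char_dim)
                + lookup(word_emb, word, word_dim)
                + lookup(rad_emb, radical, rad_dim)
            )
            fused.append(vec)
    return fused


# Toy input: a sentence already split into words by a segmentation tool.
words = ["胸痛", "心脏病"]
dims = (4, 3, 2)  # character, word, radical embedding sizes (illustrative)
char_emb = make_table(sorted(set("".join(words))), dims[0], seed=0)
word_emb = make_table(words, dims[1], seed=1)
rad_emb = make_table(sorted(set(RADICALS.values())), dims[2], seed=2)

fused = fuse_features(words, char_emb, word_emb, rad_emb, dims)
print(len(fused), len(fused[0]))
```

In this sketch, characters that share a radical (such as 痛 and 病, both under 疒) receive the same radical component, which is the kind of shared signal the radical features are meant to supply. The resulting per-character vectors would then serve as the input sequence to a tagger such as a Bi-LSTM-CRF.
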
Keywords/Search Tags: Feature fusion, Named entity recognition, Information extraction