
Research On Venous Thrombosis Entity Recognition And Relation Extraction Based On Interactive Attention And Multi-head Annotation

Posted on: 2024-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: J W Chen
GTID: 2554307109487884
Subject: Software engineering
Abstract/Summary:
As one of the three common vascular diseases, venous thromboembolism (VTE) has a fatality rate second only to tumors and myocardial infarction. Early assessment and prevention can greatly reduce its occurrence. At present, risk is assessed mainly by filling in forms manually, which is time-consuming and laborious and adds to the workload of medical staff. This thesis uses information extraction to extract VTE risk factors from electronic medical records and applies them to determine the VTE risk level, assisting doctors in clinical diagnosis. The main work covers the following four aspects:

(1) So far, no entity-relation dataset for venous thromboembolism exists in the medical field. This thesis therefore constructs one by manual annotation, defining 12 entity types and 6 relation types and annotating 1,800 electronic medical records, which contain 17,170 entity samples and 5,326 relation samples.

(2) For named entity recognition of venous thrombosis risk factors, the thesis addresses the lack of word boundary information in character-level Chinese named entity recognition. Because electronic medical records are rich in terminology, it proposes a named entity recognition method for venous thrombosis based on a medical dictionary and an interactive attention mechanism. The method matches the electronic medical record against the dictionary to obtain the position of each word in the text, obtains character vectors by applying dilated convolution to the encoding produced by the pre-trained model, and generates word vectors from the character vectors using the position information. Finally, the character vectors and word vectors are fused through the interactive attention mechanism to improve named entity recognition. Compared with the baseline model, the F1 score for entity recognition improves by 1.27% to 93.8%.

(3) For joint entity and relation extraction of venous thrombosis, the text structure of electronic medical records is complicated, and joint extraction models suffer from defects such as overlapping relations and weak interaction between entities and relations. This thesis proposes a joint entity-relation extraction model for venous thrombosis based on multi-head annotation. The model first resolves the entity-relation overlap problem through multi-head annotation, then uses BiLSTM to obtain contextual semantic features, then extracts local context information from the word vectors between entity and relation using an average pooling layer, and finally combines the sentence vector and distance embedding. The F1 scores of the joint extraction model for entity recognition and relation extraction increase by 0.9% and 1.4% respectively, reaching 93.9% and 94.8%. The proposed joint extraction model is also faster at inference.

(4) After the risk factors are extracted, rule-based reasoning is applied to them, and the scores of the matched items are then used to calculate the risk level of venous thromboembolism. Accuracy on the Wells scale and the Geneva scale reaches 80.7% and 83.1% respectively, which supports the method's accuracy in clinical practice.
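The word-vector step in aspect (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the toy lexicon, and the two-dimensional character vectors are all hypothetical, and mean pooling is only an assumed way of turning character vectors into a word vector (the abstract does not say which aggregation is used). The pre-trained encoder, the dilated convolution, and the interactive attention fusion are omitted.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # [start, end) character offsets of a matched word


def match_dictionary(text: str, lexicon: Sequence[str]) -> List[Span]:
    """Return every [start, end) span of `text` that equals a lexicon entry."""
    words = set(lexicon)
    max_len = max((len(w) for w in words), default=0)
    spans: List[Span] = []
    for start in range(len(text)):
        # Only try substrings no longer than the longest lexicon entry.
        last_end = min(len(text), start + max_len)
        for end in range(start + 1, last_end + 1):
            if text[start:end] in words:
                spans.append((start, end))
    return spans


def pool_word_vectors(
    char_vecs: Sequence[Sequence[float]], spans: Sequence[Span]
) -> List[List[float]]:
    """Build one word vector per span by averaging its character vectors.

    Mean pooling is an assumption made for this sketch.
    """
    word_vecs: List[List[float]] = []
    for start, end in spans:
        window = char_vecs[start:end]
        dim = len(window[0])
        word_vecs.append(
            [sum(vec[d] for vec in window) / len(window) for d in range(dim)]
        )
    return word_vecs


# Toy example: a 7-character phrase and a hypothetical 3-entry medical lexicon.
text = "下肢深静脉血栓"
lexicon = ["下肢", "静脉", "深静脉血栓"]
# Stand-in character vectors; a real system would take these from the encoder.
char_vecs = [[float(i), 1.0] for i in range(len(text))]

spans = match_dictionary(text, lexicon)
word_vecs = pool_word_vectors(char_vecs, spans)
```

Here the matcher keeps overlapping and nested matches (for example "静脉" inside "深静脉血栓") rather than choosing one segmentation, so a later fusion step can weigh all candidate words against each character.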
Keywords/Search Tags:Venous Thromboembolism, Named Entity Recognition, Interactive Attention Mechanism, Joint Extraction, Multi-head Annotation, Auxiliary Diagnosis