The Study Of Normalization Of Clinical Terms From Electronic Medical Records

Posted on:2021-07-30

Degree:Master

Type:Thesis

Country:China

Candidate:W Yuan

Full Text:PDF

GTID:2504306017459864

Subject:Computer technology

Abstract/Summary:

Electronic medical records (EMRs) have become an indispensable part of the work of medical institutions. They contain important information such as clinical findings, diagnoses, and drug prescriptions. This information has been applied to Natural Language Processing (NLP) studies in the clinical field, such as clinical decision-making, mortality prediction, and adverse drug reaction analysis. However, different medical institutions follow different standards when writing EMRs. Clinical term normalization can improve the sharing of clinical information between institutions and the interoperability between application platforms in the clinical field. It can also improve data quality and help optimize machine learning models based on EMR data.

The research in this thesis is based on the clinical term normalization task released by the National NLP Clinical Challenges (n2c2) in 2019. The task is to map clinical terms from EMRs to concept unique identifiers (CUIs) in the Unified Medical Language System (UMLS), where each CUI has several description strings. This thesis focuses on two problems: the scarcity of corpora for clinical term normalization, and the difficulty existing normalization methods have with terms that differ in word form but share the same meaning. The research contents are as follows:

(1) This thesis proposes a method of transferring the word features of a clinical-domain pre-trained language model into a Siamese recurrent neural network. Traditional approaches combine feature engineering with machine learning to avoid the need for a large-scale corpus, but they require the feature extraction method to be defined by hand. A Siamese network, which uses the same sub-network to process similar inputs, is well suited to calculating semantic similarity. It performs well on small-scale corpora, but it has not previously been applied to the normalization of clinical terms. In this thesis, the word features of a clinical-domain pre-trained model are fed into the Siamese recurrent neural network as the initial word vectors to normalize clinical terms. Comparative experiments cover several pre-trained language models and several recurrent neural networks, and the results are compared with the commonly used term normalization system MetaMap. These experiments demonstrate the effectiveness of the method on a small-scale annotated corpus.

(2) This thesis proposes a method of calculating similarity through crosslingual texts. Because UMLS is very large, generating a candidate set of CUI description strings is necessary. Traditional candidate set generation is based on morpheme variants and common words, which cannot handle terms that differ in word form but share the same meaning. The proposed method compares the semantics of texts in the current language by comparing their semantics in other languages. It can handle not only synonyms but also added or deleted words and changes in sentence structure and word order. In this thesis, character-based methods and term frequency-inverse document frequency (TF-IDF) based methods are first applied to generate the candidate sets, and crosslingual text similarity calculation is then used to supplement or update the candidates. Comparative experiments show that this method effectively improves the recall of the candidate set and the accuracy of normalization.
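The TF-IDF candidate generation step mentioned in (2) can be illustrated with a minimal sketch. This is not the thesis's implementation: the character-trigram features, the smoothed IDF formula, the function names, and the placeholder identifiers `CUI_A` to `CUI_C` are all assumptions made for illustration, and the toy description strings stand in for UMLS entries.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return lowercase character n-grams of a space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def weigh(grams, idf):
    """Apply TF-IDF weights and L2-normalize; n-grams unseen at index time are dropped."""
    vec = {g: tf * idf[g] for g, tf in grams.items() if g in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}


def build_index(descriptions):
    """Build smoothed IDF weights and one TF-IDF vector per (cui, string) pair."""
    counts = [Counter(char_ngrams(text)) for _, text in descriptions]
    doc_freq = Counter()
    for grams in counts:
        doc_freq.update(grams.keys())
    n_docs = len(descriptions)
    idf = {g: math.log((1 + n_docs) / (1 + df)) + 1.0 for g, df in doc_freq.items()}
    return idf, [weigh(grams, idf) for grams in counts]


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(w * b.get(g, 0.0) for g, w in a.items())


def generate_candidates(mention, descriptions, idf, vectors, top_k=3):
    """Rank description strings by similarity to the mention and keep the top_k."""
    query = weigh(Counter(char_ngrams(mention)), idf)
    scored = [
        (cosine(query, vec), cui, text)
        for (cui, text), vec in zip(descriptions, vectors)
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: placeholder identifiers, not real UMLS CUIs.
descriptions = [
    ("CUI_A", "myocardial infarction"),
    ("CUI_A", "heart attack"),
    ("CUI_B", "hypertension"),
    ("CUI_B", "high blood pressure"),
    ("CUI_C", "diabetes mellitus"),
]
idf, vectors = build_index(descriptions)
candidates = generate_candidates("myocardial infarct", descriptions, idf, vectors)
for score, cui, text in candidates:
    print(f"{score:.3f}  {cui}  {text}")
```

A surface-form ranker like this retrieves "myocardial infarction" for the mention "myocardial infarct" because the two strings share most of their trigrams, but it gives "heart attack" a score of zero for the same mention. That gap is the "different word forms with the same meaning" problem that the crosslingual similarity step is meant to address.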

Keywords/Search Tags:

Clinical Terms Normalization, Semantic Text Similarity, Siamese Network, Recurrent Neural Network, Crosslingual Text Similarity Comparison

Related items

1	Research On Recommendation Of Famous TCM Cases Based On Similarity
2	Research On Siamese Cross Contrast Neural Network Adapting To Small Medical Dataset
3	An Automatic Grading System For Electronic Medical Records With Neural Network
4	Semi-self-supervised Learning Method Based On Semantic Text Similarity Of Small Sample Electronic Medical Record
5	Research On Biomedical Text Mining Method Using Semantic-enhanced
6	Research On User Semantic Matching Within The Field Based On Rules And Contrastive Learning
7	MEDLINE Biomedical Text Clustering
8	Prioritization Of Candidate Disease Genes By Combining Topological Similarity With Semantic Similarity
9	Research And Implementation Of Structuring Processing Approach For Medical Semantic Understanding
10	Prioritization Of Candidate Disease Genes Based On Topological Similarity And Optimized PPI Network