Synonymy Analysis And Normalization Of Phenotype Entities

Posted on: 2020-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: S W Ma
Full Text: PDF
GTID: 2404330578457406
Subject: Computer Science and Technology
Abstract/Summary:
The construction of medical knowledge graphs is an important issue in medical artificial intelligence research and an important foundation for developing clinical diagnosis and treatment decision support systems. Phenotype entities and their relationships are important components of a medical knowledge graph. However, because medical terminologies change dynamically and medical texts accumulate rapidly, maintaining and updating phenotype entities and their relationships in a medical knowledge graph is time-consuming and labor-intensive. Therefore, automatically predicting synonymous relationships between phenotype concepts from medical databases, and linking phenotypes in medical texts to standard medical terminology, are basic research tasks in medical knowledge graph construction. The main research work of this paper is as follows.

Firstly, this paper constructs a phenotype synonym analysis method based on a learning-to-rank model. This method transforms the phenotype synonym relationship prediction problem into the problem of ranking candidate phenotype terms. The algorithm uses PubMed literature and related information to generate phenotype network embedding representations. The synonymous relationships between different phenotype terms are then predicted by similarity calculation and a learning-to-rank method.

Secondly, a method for synonymous relationship analysis based on classification learning is developed. This method transforms the phenotype synonym relationship prediction problem into the classification of phenotype relationships. Phenotype relationship representations are built from the phenotype network embedding representations. SVM, logistic regression, multi-layer perceptron, and other classification models are used to classify phenotype relationships, and the relationships are then classified with a fusion classification model, from which synonymous relationships between phenotypes can be predicted.

In this paper, synonymous relationship prediction experiments based on the learning-to-rank and classification learning methods are carried out on a phenotype synonymous relation dataset. The results show that both methods perform well in phenotype synonym analysis, and the F1 score of the classification-based method can reach 0.942.

Finally, for the phenotype concept normalization problem, this paper divides the problem into two sub-tasks: phenotype named entity recognition and phenotype entity linking. First, a Convolutional Neural Network (CNN) and a Bi-directional Long Short-Term Memory (BiLSTM) network are used to learn character vectors and word vectors respectively, and a Conditional Random Field (CRF) is used to construct the BiLSTM-CNN-CRF model for phenotype named entity recognition. Then the phenotype entity linking task is implemented with a CNN ranking model. Comparative experiments were performed on the National Center for Biotechnology Information (NCBI) disease corpus and the BioCreative V Chemical Disease Relation (BC5CDR) disease corpus to verify the effectiveness of the method.
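As a rough illustration of the two synonym-prediction formulations described in the abstract, the sketch below ranks candidate phenotype terms by cosine similarity of their embeddings and builds a simple pair representation that a classifier could consume. It is not the thesis's implementation: the toy vectors, the term names, the function names, and the choice of pair features are all assumptions made for the example, and a plain similarity sort stands in for the trained learning-to-rank model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_candidates(query, embeddings):
    """Rank every other term by similarity to the query term.

    Stand-in for a learning-to-rank model: here the score is
    just cosine similarity, sorted from highest to lowest.
    """
    q = embeddings[query]
    scored = [
        (term, cosine(q, vec))
        for term, vec in embeddings.items()
        if term != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def pair_features(u, v):
    """Build a relationship representation for a pair of terms.

    Assumed feature choice (not from the thesis): element-wise
    absolute difference followed by element-wise product.
    """
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return diff + prod

# Hypothetical 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "myocardial infarction": [0.90, 0.10, 0.00],
    "heart attack": [0.85, 0.15, 0.05],
    "headache": [0.00, 0.20, 0.90],
}

ranking = rank_candidates("myocardial infarction", toy_embeddings)
features = pair_features(
    toy_embeddings["myocardial infarction"],
    toy_embeddings["heart attack"],
)
```

In the ranking view, the top-ranked candidates are treated as likely synonyms of the query term. In the classification view, vectors such as `features` would be labeled synonymous or not and passed to a model such as an SVM, logistic regression, or a multi-layer perceptron.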
Keywords/Search Tags: Phenotype, Synonymous relationship prediction, Concept normalization, Classification, Neural network