Research And Implementation Of Automatic Construction Method Of Domain-Specific Ontology In Biomedicine

Posted on: 2017-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: F Tang
Full Text: PDF
GTID: 2348330566456736
Subject: Software engineering
Abstract/Summary:
In recent years, with the rapid development of biomedical technologies, protein entity recognition and protein interaction relation extraction have become important tasks in biomedicine. The literature on biological information extraction is growing explosively, and researchers find it difficult to quickly obtain relevant information about proteins from this mass of publications. The purpose of building a biomedical protein ontology is to recognize protein entities and extract the interaction relations between them from free text, helping researchers process protein information more efficiently.

This thesis focuses on methods for the automatic construction of a biomedical domain ontology, covering two tasks: protein entity recognition, which identifies protein entities in free text, and interaction relation extraction, which finds the relations between pairs of proteins.

For protein entity recognition, this thesis proposes phrase model features that build on the common basic features of protein entities, and it designs and implements a recognition approach based on the Naive Bayes classifier. For interaction relation extraction, it proposes semantic role model features and syntactic analysis model features, and it also constructs word features, part-of-speech features, and logic features. It then designs and implements an extraction method based on a support vector machine and these five kinds of features.

The experimental data for protein entity recognition are taken from the GENIA corpus. The results show that combining phrase model features with common basic features yields higher recognition accuracy than using common basic features alone. The experimental data for interaction relation extraction are taken from the IEPA corpus. In these experiments, the combined feature set includes word features, part-of-speech features, logic features, semantic role model features, and syntactic analysis model features. The results indicate that the method based on the combined feature set outperforms methods based on any single kind of feature, as well as methods that pair word features and part-of-speech features with any combination of the other three kinds.

This thesis proposes a method for constructing a biomedical domain ontology and constructs such an ontology. The result should support further protein research, improve research efficiency, and have broad application prospects in life science, medicine, agriculture, and related fields.
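
To make the entity recognition step more concrete, the following is a minimal sketch of a Naive Bayes token classifier with Laplace smoothing. It is not the thesis's implementation: the abstract does not list its "common basic features" or define its phrase model features, so the orthographic features in `token_features`, the labels `PROTEIN` and `O`, and the toy training tokens are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def token_features(token):
    """Hypothetical basic features for one token (assumed, not from the thesis)."""
    return {
        "has_digit": any(c.isdigit() for c in token),
        "has_upper": any(c.isupper() for c in token),
        "has_hyphen": "-" in token,
        "suffix": token[-2:].lower(),
    }


class NaiveBayes:
    """Categorical Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()
        # (label, feature name) -> counts of each observed feature value
        self.feat_counts = defaultdict(Counter)
        # feature name -> set of values seen in training
        self.feat_values = defaultdict(set)

    def fit(self, samples):
        for feats, label in samples:
            self.label_counts[label] += 1
            for name, value in feats.items():
                self.feat_counts[(label, name)][value] += 1
                self.feat_values[name].add(value)
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            for name, value in feats.items():
                seen = self.feat_counts[(label, name)][value]
                # Reserve one extra slot for values never seen in training.
                n_values = len(self.feat_values[name]) + 1
                score += math.log(
                    (seen + self.alpha) / (count + self.alpha * n_values)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (illustrative only, not drawn from the GENIA corpus).
train = [
    ("IL-2", "PROTEIN"), ("p53", "PROTEIN"),
    ("NF-kB", "PROTEIN"), ("STAT3", "PROTEIN"),
    ("the", "O"), ("binds", "O"), ("cells", "O"), ("with", "O"),
]
model = NaiveBayes().fit([(token_features(t), y) for t, y in train])
print(model.predict(token_features("CD28")))
print(model.predict(token_features("from")))
```

A real system would classify tokens in context and add phrase-level features on top of token-level ones, which is the direction the abstract describes; this sketch only shows how per-feature likelihoods combine under the Naive Bayes independence assumption.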
Keywords/Search Tags:biomedical, Named Entity Recognition, Entity Interaction Extraction, SVM