Research And Application On Relation Extraction Of Biomedical Text

Posted on: 2022-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: M Z Zang
GTID: 2480306311453824
Subject: Computer software and theory

Abstract/Summary:
Massive volumes of unstructured biomedical text contain a wealth of biomedical knowledge, which usually exists in the form of relationships between entities. Extracting these relationships from biomedical text is therefore of great significance for biomedicine. With the rapid development of biomedical text mining, the efficiency and accuracy of relation extraction models based on machine learning have gradually improved, and these methods have become a hot research topic. This paper focuses on the research and application of relation extraction for biomedical text.

At present, manually annotating training data sets for relation extraction is costly, and most existing data sets are based on the Unified Medical Language System (UMLS), so their relations are coarse-grained and limited in type. To address these shortcomings, this paper studies automatic labeling algorithms for biomedical text, which can provide more accurate training data for relation extraction and other data mining tasks. To address the varying sentence lengths and large academic vocabulary of biomedical text, as well as the weak denoising ability of existing models, this paper studies a distant supervision relation extraction model with an attention mechanism and ontology constraints. Based on these results, a visualization system is built to provide users with entity relationship information, as an application of relation extraction. The specific research contents are as follows:

(1) Research on an automatic annotation algorithm for biomedical text

This paper proposes an automatic annotation algorithm for biomedical text. It consists of a data acquisition module, an entity relationship triple expansion module, a named entity recognition module and a negative word relationship annotation module. In the named entity recognition module, the recall of entity annotation is increased by introducing an entity unification algorithm based on word vectors, and the accuracy of entity annotation is increased by introducing an entity disambiguation module based on syntactic analysis. The noise in relationship annotation is reduced by adding the negative word relationship annotation module. This paper uses the algorithm to automatically annotate the text of biomedical abstracts and obtains the PSREData data set. Experiments show that, compared with existing automatic annotation algorithms, the proposed algorithm annotates more entities in named entity recognition and introduces less noise in relationship annotation.

(2) Research on a distant supervision relation extraction model combining an attention mechanism and ontology

To handle the varying sentence lengths and academic vocabulary of biomedical text, this paper proposes a distantly supervised relation extraction model (APCNNs+OR) based on an attention mechanism and ontology. The model includes a feature engineering extraction module, a classifier module and an ontology constraint layer. In the classifier module, the method improves the instance-level attention mechanism so that it better learns the weight of each sentence in a data bag, and effectively reduces the noise introduced by the distant supervision assumption and by the words between the entities in a sentence. In the ontology constraint layer, a biomedical ontology is introduced to constrain the extraction results, so as to improve the accuracy of biomedical relation extraction, reduce the interference between synonyms, and prevent the extracted relationships from contradicting reality. The experimental results on SemMed, GoldStandard and PSREData show that the proposed model effectively reduces the noise caused by false labels and achieves better relation extraction performance than existing models.

(3) Research on the construction of a relation extraction visualization system for biomedical text

Based on the automatic text annotation algorithm and the relation extraction model proposed in this paper, a relation extraction visualization system for biomedical text is constructed as an application of relation extraction. Through the system, users can query the treatment relationships between drugs and diseases, the interactions between diseases, and other types of relation, and can use it to learn relevant knowledge in the biomedical field. Compared with other systems, this system also shows information such as the relation extraction method and the source sentence of each relationship. With this information, scholars can directly find the corresponding paper, which saves time when searching for related content.
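
The two ideas in part (2), instance-level attention over a bag of sentences and an ontology constraint on the predicted relations, can be illustrated with a minimal sketch. This is not the APCNNs+OR model itself: the dot-product scoring, the toy sentence vectors, the relation names and the type signatures in `allowed` are all assumptions made for illustration, and the abstract does not specify how the thesis computes either step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bag_attention(sentence_vecs, relation_query):
    """Weight each sentence in a bag by its dot-product match with a
    relation query vector, then return the weights and the weighted sum.
    (Assumed scoring function; the thesis may use a different one.)"""
    scores = [sum(x * q for x, q in zip(vec, relation_query))
              for vec in sentence_vecs]
    weights = softmax(scores)
    dim = len(relation_query)
    bag_vec = [sum(w * vec[i] for w, vec in zip(weights, sentence_vecs))
               for i in range(dim)]
    return weights, bag_vec


def apply_ontology_constraint(relation_probs, head_type, tail_type, allowed):
    """Zero out relations whose (head type, tail type) pair is not permitted
    by the ontology, then renormalise the remaining probabilities."""
    masked = {
        rel: (p if (head_type, tail_type) in allowed.get(rel, set()) else 0.0)
        for rel, p in relation_probs.items()
    }
    total = sum(masked.values())
    if total == 0.0:
        return masked
    return {rel: p / total for rel, p in masked.items()}


# Toy bag: three sentence vectors mentioning the same entity pair.
bag = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
query = [2.0, 0.0]  # hypothetical relation query vector
weights, bag_vec = bag_attention(bag, query)

# Hypothetical ontology: which entity-type pairs each relation may connect.
allowed = {
    "TREATS": {("Drug", "Disease")},
    "CAUSES": {("Drug", "Disease"), ("Disease", "Disease")},
    "COEXISTS_WITH": {("Disease", "Disease")},
}
probs = {"TREATS": 0.5, "CAUSES": 0.2, "COEXISTS_WITH": 0.3}
constrained = apply_ontology_constraint(probs, "Drug", "Disease", allowed)
```

In this sketch the second sentence, which does not match the query, receives the smallest weight, so a noisy sentence contributes less to the bag representation. The constraint step removes `COEXISTS_WITH` for a drug-disease pair because the assumed ontology only permits it between two diseases.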
Keywords/Search Tags:relation extraction, ontology, knowledge graph, automatic annotation