Research On Biomedical Entity Relation Extraction With Knowledge Base

Posted on: 2021-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: C K Lang
Full Text: PDF
GTID: 2428330626960361
Subject: Computer Science and Technology
Abstract/Summary:
With the development of Internet technology and the arrival of the big data era, the volume of biomedical literature is growing explosively. How to extract structured information from large amounts of unstructured biomedical text has become an urgent problem. Entity relation extraction is one of the key tasks of structured information extraction; it aims to discover the semantic relations between entity pairs in text. In the biomedical domain, there are a large number of chemical-induced disease relations between chemical entities and disease entities. This thesis focuses on extracting this relation. The main research contents are as follows.

(1) Entity relation extraction based on contextual semantics

This part explores the influence of semantic information on chemical-induced disease relation extraction performance. First, a sample screening method is used to divide the chemical-induced disease relation extraction task into intra-sentence instances and inter-sentence instances. Then CNN, BiLSTM and Transformer are used to construct context-based and entity-attention-based relation extraction models, exploring how different context sequence inputs and different feature selection methods affect the mining of contextual semantic information. Experiments show that the method based on the shortest dependency path and entity attention effectively improves entity relation extraction performance.

(2) Entity relation extraction based on knowledge representations

There are a large number of knowledge bases in the biomedical domain. The structured knowledge they contain provides strong guidance for biomedical entity relation extraction. First, TransE is used to learn the structured knowledge in the knowledge base and obtain knowledge representations. Then a gated convolutional neural network and a gated Transformer are applied to control the expression of context information based on the knowledge representations. They deeply integrate the structured knowledge in knowledge bases with free-text information, yielding a high-performance entity relation extraction model based on knowledge representations. The gated convolutional neural network and the gated Transformer can effectively merge knowledge information and text information. The introduction of knowledge representations significantly improves the performance of chemical-induced disease relation extraction.

(3) Entity relation extraction based on distant supervision

In addition to manually labeled corpora, there are a large number of unlabeled texts in the biomedical domain. Using these texts effectively can partly solve the problem of insufficient training data in biomedical entity relation extraction. First, large-scale unlabeled texts are aligned with knowledge triples to obtain a distantly supervised labeled corpus, which contains some noise. To remove the noise in the distantly supervised corpus, the encoded semantic representations are converted from noisy space to clean space, or from clean space to noisy space, through a noise converter, and are then used for relation extraction. Experiments show that entity relation extraction based on distant supervision can make full use of knowledge bases and unlabeled texts, and effectively improves entity relation extraction performance.

The research in this thesis can effectively improve the performance of chemical-induced disease relation extraction. It can also be extended to relation extraction tasks in other domains, provided a domain knowledge base is available. The approach is therefore broadly applicable across domains.
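To make the TransE step in part (2) concrete, the following is a minimal sketch, not the thesis implementation. TransE embeds a triple (head, relation, tail) so that head + relation lies close to tail, and trains with a margin loss against corrupted triples. The squared L2 distance, the tail-only corruption, the hyperparameters, the helper names, and the placeholder entities `chem_A`, `disease_X`, etc. are all assumptions made for illustration; entity normalization is omitted for brevity.

```python
import random

def transe_distance(h, r, t):
    """Squared L2 distance ||h + r - t||^2; lower means a more plausible triple."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))

def hinge_loss(ent, rel, pos, neg, margin=1.0):
    """Margin loss: the positive triple should score lower than the negative."""
    d_pos = transe_distance(ent[pos[0]], rel[pos[1]], ent[pos[2]])
    d_neg = transe_distance(ent[neg[0]], rel[neg[1]], ent[neg[2]])
    return max(0.0, margin + d_pos - d_neg)

def sgd_step(ent, rel, pos, neg, margin=1.0, lr=0.05):
    """One SGD update on a (positive, tail-corrupted negative) pair.

    Assumes neg shares head and relation with pos. Returns the loss
    measured before the update.
    """
    loss = hinge_loss(ent, rel, pos, neg, margin)
    if loss == 0.0:
        return loss  # margin already satisfied, no gradient
    h, r, t = pos
    t_neg = neg[2]
    for k in range(len(rel[r])):
        g_pos = 2 * (ent[h][k] + rel[r][k] - ent[t][k])      # d(d_pos)/d(h_k)
        g_neg = 2 * (ent[h][k] + rel[r][k] - ent[t_neg][k])  # d(d_neg)/d(h_k)
        ent[h][k] -= lr * (g_pos - g_neg)
        rel[r][k] -= lr * (g_pos - g_neg)
        ent[t][k] += lr * g_pos       # pull the true tail toward h + r
        ent[t_neg][k] -= lr * g_neg   # push the corrupted tail away
    return loss

def train_transe(triples, dim=8, epochs=200, seed=0):
    """Learn entity and relation vectors from (head, relation, tail) triples."""
    rng = random.Random(seed)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    ent = {e: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for e in entities}
    rel = {r: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for r in relations}
    known = set(triples)
    for _ in range(epochs):
        for h, r, t in triples:
            # Corrupt the tail with any entity that does not form a known triple.
            candidates = [e for e in entities if (h, r, e) not in known]
            if candidates:
                sgd_step(ent, rel, (h, r, t), (h, r, rng.choice(candidates)))
    return ent, rel

# Toy knowledge base with placeholder names (not real biomedical facts).
kb = [("chem_A", "induces", "disease_X"), ("chem_B", "induces", "disease_Y")]
ent, rel = train_transe(kb)
```

The resulting vectors in `ent` and `rel` play the role of the knowledge representations that a downstream model, such as the gated networks described in part (2), could consume.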
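The alignment step in part (3) can be sketched in the same spirit. This is an illustrative assumption about how unlabeled sentences might be matched against knowledge triples, not the procedure used in the thesis: entity mentions are found by exact token match (a real pipeline would use entity recognition and normalization), and the function name, the `"NA"` label, and the placeholder entities are hypothetical.

```python
def align_with_kb(sentences, kb_triples, chemicals, diseases):
    """Label every co-occurring (chemical, disease) pair using the knowledge base.

    A pair found in the knowledge base receives that triple's relation;
    any other co-occurring pair is labeled "NA".
    """
    kb = {(head, tail): relation for head, relation, tail in kb_triples}
    labeled = []
    for sent in sentences:
        # Naive tokenization: split on whitespace and strip common punctuation.
        tokens = {tok.strip(".,;:()") for tok in sent.split()}
        for chem in chemicals:
            if chem not in tokens:
                continue
            for dis in diseases:
                if dis in tokens:
                    labeled.append((sent, chem, dis, kb.get((chem, dis), "NA")))
    return labeled

# Placeholder data (not real biomedical facts).
kb_triples = [("chem_A", "induces", "disease_X")]
sentences = [
    "chem_A caused disease_X in several patients.",
    "chem_A was given to patients with disease_X.",
    "chem_B was tested in patients with disease_X.",
]
corpus = align_with_kb(sentences, kb_triples, ["chem_A", "chem_B"], ["disease_X"])
for sent, chem, dis, label in corpus:
    print(label, "|", sent)
```

The second sentence is labeled `induces` even though it describes treatment rather than causation. This is the kind of labeling noise that the noise converter in part (3) is meant to address.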
Keywords/Search Tags: CDR, Knowledge Representations, Dependency Information, Knowledge Bases, Distant Supervision