
Research And Application Of Entity Relation Extraction Based On Deep Learning

Posted on: 2022-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Zhong
Full Text: PDF
GTID: 2518306524989419
Subject: Master of Engineering
Abstract/Summary:
In natural language processing, entity relation extraction is an important method for processing unstructured text: the model extracts entity pairs and the relation between them to form a triple of the form (head, relation, tail). These relation triples are then used to generate an entity relation network, which is the technical basis for building knowledge graphs and for subsequent research. Biomedical literature contains a large amount of unstructured text, and industry is also interested in analyzing it, so we choose this field as the application scenario.

With the development of deep learning, entity relation extraction methods based on deep learning have achieved better results than those based on traditional feature extraction. However, training deep learning models requires a large amount of labeled data, which is a general challenge in the field. In addition, most existing models only weakly connect the two subtasks (entity recognition and relation extraction) and cannot handle overlapping triples. This thesis addresses both challenges.

To address the lack of labeled data, we follow the idea of distant supervision to generate labels automatically from knowledge bases, and then denoise the generated dataset with our proposed RLDN-RL model to improve its quality. We select CTD and OpenKG as the biomedical knowledge bases and obtain unstructured text from the PubMed biomedical literature database. We then align the unstructured text with the triples in the knowledge bases to obtain labels automatically. Because the dataset generated by distant supervision contains a large amount of noise, we use a rule-based method to denoise the negative samples and a reinforcement learning method to denoise the positive samples, yielding a higher-quality dataset.

To address the second problem, we propose the TagRE model, which adopts joint extraction and redefines the subtasks. Joint extraction extracts the whole triple at once, which avoids the weak connection between subtasks, since that problem is mainly caused by treating the two tasks separately. The model also redefines how the task is split into two subtasks: the head entity is extracted first, and the tail entity is then predicted for each relation given the extracted head entity. Because different relations are modeled separately, the extraction of overlapping triples can in principle be improved.

Based on the structured triples obtained from the above data sources, we design and construct a biomedical knowledge graph, which displays all the triples in a graphical interface. We also build an entity and relation query module, which offers convenience to researchers and medical workers.
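The distant-supervision labeling step mentioned in the abstract (aligning unstructured text with knowledge-base triples) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Triple` type, the `distant_label` function, the `NA` label, the simple substring matching, and the toy knowledge base and sentences are all assumptions made for illustration, and none of the data comes from CTD, OpenKG, or PubMed.

```python
from typing import List, NamedTuple, Optional, Tuple


class Triple(NamedTuple):
    """A knowledge-base fact (hypothetical structure for this sketch)."""
    head: str
    relation: str
    tail: str


# One labeled example: (sentence, head, relation, tail).
Labeled = Tuple[str, Optional[str], str, Optional[str]]

# Toy knowledge base and sentences, invented for illustration only.
KB = [
    Triple("aspirin", "treats", "headache"),
    Triple("cisplatin", "induces", "nephrotoxicity"),
]
SENTENCES = [
    "Aspirin is commonly used to relieve headache.",
    "Patients with headache were excluded from the aspirin trial.",
    "The study enrolled 40 volunteers.",
]


def distant_label(sentences: List[str], kb: List[Triple],
                  no_relation: str = "NA") -> List[Labeled]:
    """Label a sentence with a KB relation whenever it mentions both entities.

    Sentences that match no triple are kept as negative ("NA") samples.
    Matching is a case-insensitive substring test, a deliberate simplification.
    """
    labeled: List[Labeled] = []
    for sent in sentences:
        lowered = sent.lower()
        matches = [t for t in kb
                   if t.head.lower() in lowered and t.tail.lower() in lowered]
        if matches:
            labeled.extend((sent, t.head, t.relation, t.tail) for t in matches)
        else:
            labeled.append((sent, None, no_relation, None))
    return labeled


if __name__ == "__main__":
    for example in distant_label(SENTENCES, KB):
        print(example)
```

In this toy run the second sentence receives the `treats` label even though it does not express that relation. This is the kind of noisy positive sample that motivates a denoising stage after distant supervision.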
Keywords/Search Tags: Deep Learning, Distant Supervision, Joint Entity-Relation Extraction, Knowledge Graph