Research And Implementation Of Knowledge Extraction For Domain Knowledge Graph Construction

Posted on: 2022-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Ding
Full Text: PDF
GTID: 2518306341453664
Subject: Computer Science and Technology
Abstract/Summary:
As power grid data is digitized, the volume of text describing equipment failures has grown rapidly. To make use of these resources, the knowledge contained in power grid equipment fault texts needs to be structured so that a knowledge graph can be constructed. Knowledge graphs are usually built through knowledge extraction, which extracts structured knowledge triples from unstructured text. Its main tasks are named entity recognition and relation extraction. However, existing named entity recognition models lack domain specificity, and most of them predict entity location and entity category jointly, which leads to error accumulation. In relation extraction, training supervised models relies on manual annotation, but because the power grid domain requires specialized expertise, annotation is costly and the demand for labeled data is hard to meet.

To address these problems, this thesis proposes a boundary-aware named entity recognition model based on multi-task learning, with innovations in task construction and in the use of domain entity information. The model is based on the Transformer with a multi-head attention mechanism. It decomposes the traditional task into an entity boundary-aware task and an entity classification task, and trains them with multi-task learning to reduce the accumulation of errors between tasks. In the entity classification task, the model also uses a similarity calculation based on a comprehensive description of each entity category, so that it is better targeted at domain entities. Experiments on public and domain data sets are conducted to demonstrate the model's advantages.

For relation extraction, a model based on a multi-class relation attention mechanism is proposed. It builds on distantly supervised multi-instance learning to reduce the dependence on manual annotation. To better target the entity relationships in the power grid equipment fault domain, the model uses features of the relationships in that domain. Experiments on a public data set and a domain data set are conducted to verify the model.

To build the knowledge graph, domain entities are first extracted and the domain dictionary is completed. Second, a general knowledge base and power grid standards are used to align knowledge and construct the domain entity relationship data set, and relation extraction is performed to obtain the knowledge set. Finally, knowledge fusion is used to construct the knowledge graph, and its storage and visualization are implemented. The thesis compares query results from the power grid equipment fault knowledge graph with those from a general knowledge graph. The comparison shows that the domain knowledge graph has clear advantages in professionalism, detail, and domain specificity.
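The abstract does not give the details of the relation extraction model, so the following is only a minimal sketch of the general idea behind attention in multi-instance learning, not the thesis's multi-class relation attention mechanism. In multi-instance learning, all sentences that mention the same entity pair form a bag, and an attention step down-weights sentences that do not actually express the candidate relation. The function names, the toy two-dimensional sentence vectors, and the relation names `causes` and `part_of` are all hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bag_representation(sentence_vectors, relation_query):
    """Attention-weighted average of the sentence vectors in one bag.

    Each sentence is scored against the relation query vector, so
    sentences that do not match the relation receive a small weight.
    """
    weights = softmax([dot(s, relation_query) for s in sentence_vectors])
    dim = len(sentence_vectors[0])
    bag = [
        sum(w * s[i] for w, s in zip(weights, sentence_vectors))
        for i in range(dim)
    ]
    return bag, weights


def score_relations(sentence_vectors, relation_queries):
    """Score every candidate relation with its own attention over the bag."""
    scores = {}
    for name, query in relation_queries.items():
        bag, _ = bag_representation(sentence_vectors, query)
        scores[name] = dot(bag, query)
    return scores


# Toy bag (assumed data): three sentences mentioning the same entity pair.
# The first two resemble the hypothetical "causes" relation; the third
# stands in for a noisy sentence picked up by distant supervision.
toy_bag = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_queries = {"causes": [2.0, 0.0], "part_of": [0.0, 2.0]}

relation_scores = score_relations(toy_bag, toy_queries)
predicted = max(relation_scores, key=relation_scores.get)
print(predicted, relation_scores)
```

In a trained model the sentence vectors would come from an encoder and the relation queries would be learned parameters; here both are fixed by hand so that the weighting step can be inspected directly.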
Keywords/Search Tags:knowledge extraction, knowledge graph, named entity recognition, relation extraction, multi-task learning, multi-instance learning