
Research And Development On Knowledge Acquisition For Knowledge Graph

Posted on: 2019-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: W Chen
Full Text: PDF
GTID: 2428330545951248
Subject: Computer technology
Abstract/Summary:
A knowledge graph (KG) is composed of interconnected entities and their relations. In other words, a KG is a graph with entities as nodes and relations as edges. A pair of entities and the relation between them is generally represented as a triple. In recent years, as a novel knowledge organization and retrieval technology in the era of big data, KG has gradually shown its advantages in organizing and displaying knowledge and has attracted growing attention in academia and industry. This thesis studies knowledge acquisition for KG. The main content is divided into the following parts.

(1) This thesis studies entity alias mining and identification from three aspects. First, we use a rule-based method to build a knowledge base of entity aliases from text containing structured entity information. Second, we adopt a bootstrapping method that uses this alias knowledge base as seeds: we mine patterns in the text and select the valid ones to extract aliases of the corresponding entities from free text. Third, we build a maximum entropy classifier that automatically decides whether a candidate is an alias, which further improves the quality of the extracted entity aliases.

(2) This thesis studies attribute knowledge verification and multi-source data fusion. Automatically constructing a KG produces conflicting triples, so knowledge verification is needed to ensure the correctness of the knowledge. This thesis uses a variety of data sources (such as Baidu Encyclopedia and Wikipedia) to verify the attribute knowledge of triples by voting. After knowledge verification, this thesis uses the Jaccard distance to measure text similarity and fuse the multi-source data, which yields a higher-quality knowledge base. This thesis also builds a KG system to display the knowledge base.

(3) This thesis studies the application of KG to automatic keyword extraction. It employs entities in the knowledge base as external resources for the automatic keyword extraction task in order to further improve performance. It proposes a new keyword extraction method based on a Bidirectional Long Short-Term Memory network and a Conditional Random Field (BiLSTM-CRF), which treats the extraction task as a sequence labeling problem. First, the input text is represented as low-dimensional dense vectors. Then a deeper representation of the text is obtained through the BiLSTM layer. Next, a CRF layer decodes the whole sequence to produce the tagging result. Finally, we conduct experiments on large-scale real data and achieve good experimental results.

In summary, this thesis studies knowledge acquisition for KG, including entity alias mining and identification, attribute knowledge verification and multi-source data fusion, and KG-based automatic keyword extraction. We achieve good experimental results on entity alias mining and identification and on attribute knowledge verification and multi-source data fusion. In particular, applying KG to the automatic keyword extraction task improves its performance and yields some preliminary results. We expect that these research results can be applied to other natural language processing tasks in the future.
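
As a rough illustration of part (2), the voting-based verification and the Jaccard measure could be sketched as follows. This is a minimal sketch and not the thesis's implementation: the function names, the tie-handling rule, the choice of characters as set elements, and the sample source names and values are all assumptions made for the example.

```python
from collections import Counter


def verify_by_vote(values_by_source):
    """Pick the attribute value asserted by the most sources.

    values_by_source maps a source name to the value that source gives
    for one (entity, attribute) pair. Returns None when there is no
    input or when the top two values tie (assumed policy: no decision).
    """
    counts = Counter(values_by_source.values())
    ranked = counts.most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def jaccard_similarity(text_a, text_b):
    """Jaccard similarity over character sets (assumed tokenization)."""
    set_a, set_b = set(text_a), set(text_b)
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def jaccard_distance(text_a, text_b):
    """Jaccard distance is one minus the Jaccard similarity."""
    return 1.0 - jaccard_similarity(text_a, text_b)


# Hypothetical sample data: three sources state a value for one attribute.
votes = {"source_a": "1879", "source_b": "1879", "source_c": "1880"}
winner = verify_by_vote(votes)
similarity = jaccard_similarity("abcd", "bcde")
```

Under these assumptions, two attribute values from different sources would be merged when their Jaccard distance falls below a chosen threshold; the threshold itself is not given in the abstract and would need to be tuned.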
Keywords/Search Tags: Knowledge base, Entity alias, Knowledge verification, Keyword extraction