Research And Application Of Unstructured Association Rule Extraction Based On Knowledge Graphs

Posted on: 2020-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Ren
Full Text: PDF
GTID: 2428330590971715
Subject: Computer Science and Technology
Abstract/Summary:
Knowledge bases (KBs) store knowledge about the real world in a structured form. Because this knowledge is easy for computers to process, KBs play a vital role in many natural language processing tasks. Although current KBs already hold a large quantity of knowledge, they still cover far less than what exists in the real world. Much research therefore focuses on how to enrich knowledge bases with more high-quality knowledge. This thesis proposes a novel natural language enhanced association rules mining (NEARM) framework. The rules it produces can be used to infer the triple facts contained in natural language text and thereby improve the knowledge base. The work of this thesis consists mainly of the following parts:

1. To extract triple facts from unstructured text, this thesis uses the LSWMD algorithm to compute the similarity between relational texts and clusters them with the density peak clustering algorithm. Because this clustering method has very low time efficiency, the thesis also uses the K-BoD text clustering algorithm to improve it.

2. To model the clusters in a unified representation, this thesis proposes an improved BoD, BoD-GS (Bag of Distribution Based on Gauss Distribution), and BoD-TDGS (Bag of Distribution Based on Two-dimensional Gauss Distribution) to model the relational text. The experimental results show that BoD-TDGS is more accurate and reasonable than BoD-GS.

3. Studying the text clusters obtained from the above models shows that, in ordinary natural language, the subjects and objects of texts that follow the same expression pattern tend to have similar property values. Modeling these regularities helps extract triple facts from more texts. Inspired by association rules mining algorithms, this thesis proposes and studies enhanced association rules mining for unstructured text, which introduces natural language text into association rules. The resulting association rules make full use of the knowledge in both the text and the knowledge base, and can be applied directly to unstructured text to extract triple facts.

4. To simulate the process of improving the knowledge base with unstructured association rules, this thesis builds a knowledge extraction prototype system based on these rules.

5. Finally, experimental results demonstrate the effectiveness of NEARM on relation classification and triple fact reasoning.
Keywords/Search Tags:knowledge base completion, association rules, text clustering, text pattern modeling
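
As a rough illustration of the idea in part 3, the sketch below mines rules whose antecedent combines a text-pattern cluster with property values of the subject and object, and whose consequent is a KB relation. It is not the NEARM implementation: the toy observations, the item names (`pattern:...`, `subj:...`, `obj:...`), the thresholds, and the functions `mine_rules` and `apply_rules` are all assumptions made for this example. For brevity it scores only complete antecedent sets by support and confidence, rather than enumerating sub-itemsets as a full Apriori-style miner would.

```python
from collections import Counter

# Hypothetical toy observations: (antecedent items, relation).
# Each antecedent mixes a text-pattern cluster id with KB property values
# of the subject and object. All names and values are illustrative only.
observations = [
    (frozenset({"pattern:born_in", "subj:type=Person", "obj:type=City"}), "birthPlace"),
    (frozenset({"pattern:born_in", "subj:type=Person", "obj:type=City"}), "birthPlace"),
    (frozenset({"pattern:born_in", "subj:type=Person", "obj:type=City"}), "deathPlace"),
    (frozenset({"pattern:works_at", "subj:type=Person", "obj:type=Company"}), "employer"),
]


def mine_rules(observations, min_support=0.25, min_confidence=0.6):
    """Return (antecedent, relation, support, confidence) for rules above both thresholds."""
    n = len(observations)
    antecedent_counts = Counter(items for items, _ in observations)
    rule_counts = Counter(observations)
    rules = []
    for (items, relation), count in rule_counts.items():
        support = count / n                          # share of all observations
        confidence = count / antecedent_counts[items]  # share among matching antecedents
        if support >= min_support and confidence >= min_confidence:
            rules.append((items, relation, support, confidence))
    return rules


def apply_rules(rules, items):
    """Return the relation of the most confident rule whose antecedent is contained in items."""
    matches = [(rel, conf) for ante, rel, _, conf in rules if ante <= items]
    if not matches:
        return None
    return max(matches, key=lambda m: m[1])[0]


if __name__ == "__main__":
    rules = mine_rules(observations)
    for ante, rel, sup, conf in rules:
        print(sorted(ante), "->", rel, f"support={sup:.2f} confidence={conf:.2f}")
    new_mention = frozenset({"pattern:born_in", "subj:type=Person", "obj:type=City"})
    print(apply_rules(rules, new_mention))
```

In this sketch a new sentence is first mapped to its pattern cluster and its entities' property values, and a triple is proposed only when some mined rule covers that combination; mentions that match no rule are left unlabeled.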