
Mutual Information Based Modeling And Completion Of Correlations In Knowledge Graphs

Posted on: 2019-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: W Xia
Full Text: PDF
GTID: 2428330548474401
Subject: Science and Engineering
Abstract/Summary:
With the rapid development of the Internet and the popularity of Web 2.0, data analysis and knowledge discovery and management face new challenges. The Knowledge Graph (KG) provides a novel way to organize massive data and domain knowledge. KG completion, which mainly involves filling in missing entities and completing missing relationships between entities, is one of the most closely watched topics in KG research. It is also a solid foundation for information retrieval and services in the context of massive data. Meanwhile, user-generated data (UGD), such as records of users browsing webpages and products, is being produced rapidly. The relationships among entities revealed by UGD may differ from the knowledge described by a KG, and these differences can help supplement the KG. At present, KGs are completed by knowledge inference over KG paths, for example with the path-ranking algorithm. Because entity relationships in a KG are sparse and sometimes wrong, and their connectivity is poor, such methods can extract inaccurate entity relationships and leave the KG incomplete. Therefore, starting from UGD, this thesis uses mutual information to model the relationships between uncertain knowledge and then obtains the entity nodes that are associated with one another. Consequently, we can complete the missing relationships among KG entities and obtain a more realistic KG as the basis for personalized recommendation and correlation query processing. Specifically, the main work is as follows:

(1) UGD contains a large number of entities and relationships between entities, which can make up for the lack of entity relationships in a KG. For the large number of entity nodes in UGD, we use mutual information to quantify the degree of association between entity nodes on the Spark distributed computing framework, and we determine the direction of each association from the size of the mutual influence between the two entity nodes. This yields an entity node association model, from which we construct the Entity Association Graph (EAG) out of “entity-association value-entity” triplets.

(2) In the EAG, in addition to directly associated entity nodes, there may be potentially associated entity nodes. We therefore use the concept of associated impact superposition to calculate the potential associations among entity nodes. GraphX provides a method for computing the neighbor nodes of a graph, which is convenient for the graph calculations in this thesis.

(3) We use actual behavior records of Taobao users as the experimental data set. The experimental results verify the efficiency and effectiveness of the proposed method. Based on the proposed method, we design and implement a prototype system, the "Mutual Information Based Modeling and Completion of Correlations in Knowledge Graphs" platform, which shows the specific process of KG completion.
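The association modeling in step (1) can be illustrated with a small single-machine sketch. It is not the thesis's Spark implementation, and the abstract does not give the exact formulas. The sketch assumes that each user record is a set of entities, that the association value is the mutual information between the binary occurrence indicators of two entities, and that an edge points from a to b when P(b|a) >= P(a|b). The names `binary_mutual_information`, `build_eag`, and `min_mi`, as well as the sample records, are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def binary_mutual_information(n_ab, n_a, n_b, n):
    """Mutual information (in nats) between the occurrence indicators of two
    entities, given co-occurrence and marginal counts over n records."""
    joint = {
        (1, 1): n_ab,
        (1, 0): n_a - n_ab,
        (0, 1): n_b - n_ab,
        (0, 0): n - n_a - n_b + n_ab,
    }
    marg_a = {1: n_a, 0: n - n_a}
    marg_b = {1: n_b, 0: n - n_b}
    mi = 0.0
    for (x, y), count in joint.items():
        if count == 0:
            continue  # a zero-probability cell contributes nothing
        mi += (count / n) * math.log(count * n / (marg_a[x] * marg_b[y]))
    return mi


def build_eag(records, min_mi=0.0):
    """Return directed (source, association value, target) triples.

    Only entity pairs that co-occur in at least one record are scored.
    The direction rule below is an assumption made for illustration.
    """
    n = len(records)
    entity_count = Counter()
    pair_count = Counter()
    for record in records:
        entities = sorted(set(record))
        entity_count.update(entities)
        pair_count.update(combinations(entities, 2))

    triples = []
    for (a, b), n_ab in pair_count.items():
        mi = binary_mutual_information(n_ab, entity_count[a], entity_count[b], n)
        if mi <= min_mi:
            continue
        # Assumed rule: point the edge toward the entity that is better
        # predicted by the other one, comparing P(b|a) with P(a|b).
        p_b_given_a = n_ab / entity_count[a]
        p_a_given_b = n_ab / entity_count[b]
        if p_b_given_a >= p_a_given_b:
            triples.append((a, mi, b))
        else:
            triples.append((b, mi, a))
    return triples


# Hypothetical browsing records, one list of entities per user session.
records = [
    ["phone", "case"],
    ["phone", "case"],
    ["phone", "charger"],
    ["phone"],
    ["book"],
    ["book", "lamp"],
]
for source, value, target in build_eag(records):
    print(f"{source} -[{value:.4f}]-> {target}")
```

Each printed line corresponds to one “entity-association value-entity” triplet. In a distributed setting the two counting passes would map onto aggregations over the record set, while the per-pair scoring stays the same.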
Keywords/Search Tags:Knowledge graph, Completion, User-generated data, Mutual information, Association impact