Research On Semantic Meta Path Analysis Method Of Heterogeneous Information Network

Posted on: 2020-08-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Y Zheng
Full Text: PDF
GTID: 1368330605981319
Subject: Computer Science and Technology
Abstract/Summary:
Information networks are ubiquitous: any network composed of different components can be called an information network. Information network analysis is a research hotspot in data mining, but current work is mainly based on the homogeneous information network, that is, a network containing a single type of object and link. Modeling data this way can leave information incomplete or lost. As a result, many researchers have begun to model interconnected, multi-typed networked data as a heterogeneous information network, i.e., a network containing different types of objects and links. A heterogeneous information network can model not only networked data with simple patterns, such as scientific literature data, but also networked data with complex structure, such as knowledge graph data represented as triples. Compared with homogeneous information network modeling, heterogeneous information network modeling represents the components of a system and the relationships between them more fully, which leads to more meaningful knowledge discovery.

Objects and links in a heterogeneous information network carry rich semantic information, and a meta path, which is a sequence of relations linking object types, can capture such semantics. Many data mining tasks in heterogeneous information networks are based on meta paths. This dissertation therefore takes the heterogeneous information network as its research target and studies heterogeneous network analysis methods from the perspective of the meta path. Although much related work employs meta paths in heterogeneous information networks, the research still faces the following challenges:

(1) The network contains many complex semantic relationships. Current meta path based similarity methods have difficulty capturing the semantics of complex relationships and cannot meet the needs of complex applications.
(2) The semantic description given by a meta path is relatively simple. It can express only a single piece of information and cannot express more subtle semantics.
(3) A complex heterogeneous network contains many types of objects and relationships and has no simple schema, so the meta paths are too numerous to enumerate.

To address these challenges, this dissertation proceeds progressively from three perspectives: the meta path, the subtle meta path, and the automatic discovery of meta paths. The main work includes the following aspects:

1. Current similarity measures based on semantic meta paths can only measure the similarity between individual objects. This dissertation studies how to measure the closeness between an object and a set in heterogeneous information networks and proposes an optimal set discovery method based on the approximate density subgraph. The method uses meta paths and embeddings to construct a weighted heterogeneous information network and introduces the concept of the quasi-clique from the maximum density subgraph to discover the set closest to a given object. The problem of author set prediction is modeled in this way, and the method is verified and analyzed experimentally.

2. A meta path can represent only a single piece of semantic information. To represent more subtle semantics, this dissertation proposes the weighted hierarchical meta path, studies concept semantic similarity measurement in heterogeneous information networks, and proposes a concept semantic model based on the weighted hierarchical meta path. The model uses conditional probability to compute the edge weights in a weighted hierarchical meta path and integrates the structural density and depth of concepts. The weighted path is then combined with the information content of the concepts to measure the similarity between concepts. Finally, experiments and analysis are carried out on a classical word similarity dataset and on an aspect category classification task.

3. Meta paths usually have to be specified by domain experts. This dissertation studies the automatic discovery of meta paths in heterogeneous information networks and proposes an automatic meta path discovery method for entity set expansion. The method traverses the network in both depth and breadth and uses a tree structure to describe the discovery process. Because meta paths differ in importance, the dissertation further studies meta path weight learning and proposes heuristic and semi-supervised learning methods. Finally, a model is built for the entity set expansion problem, and the effectiveness of the proposed method is validated on the Yago dataset.

4. Automatic meta path discovery is inefficient. This dissertation studies how to discover meta paths efficiently in heterogeneous information networks and proposes an efficient discovery method based on frequent patterns. Inspired by the Apriori algorithm, the method maps each entity to a transaction in a transaction database, maps the relationships of the entity to the items of that transaction, and sets a minimum support threshold to find frequent relationships. It then connects these relationships to obtain meta paths that reveal the characteristics of the entities. Finally, the meta paths are weighted to build an entity set expansion model. Experimental analysis on Yago demonstrates the efficiency of the proposed meta path discovery method.
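
The frequent-pattern idea in the fourth contribution above can be illustrated with a small sketch. This is not the dissertation's implementation: the toy triples, the function names `build_transactions` and `frequent_itemsets`, and the choice of (relation, object) pairs as transaction items are all assumptions made for illustration. The abstract does not say how frequent relationships are connected into meta paths, so the sketch only joins frequent items into larger itemsets in the level-wise Apriori manner, and it omits Apriori's subset-pruning step for brevity.

```python
from itertools import combinations

# Hypothetical toy triples (subject, relation, object); not taken from Yago.
TRIPLES = [
    ("Einstein", "wonPrize", "NobelPhysics"),
    ("Einstein", "bornIn", "Germany"),
    ("Planck", "wonPrize", "NobelPhysics"),
    ("Planck", "bornIn", "Germany"),
    ("Bohr", "wonPrize", "NobelPhysics"),
    ("Bohr", "bornIn", "Denmark"),
]


def build_transactions(triples, seeds):
    """Map each seed entity to a transaction: the set of (relation, object) items it has."""
    transactions = {seed: set() for seed in seeds}
    for subj, rel, obj in triples:
        if subj in transactions:
            transactions[subj].add((rel, obj))
    return transactions


def frequent_itemsets(transactions, min_support, max_size=2):
    """Level-wise, Apriori-style search. Returns {frozenset(items): support}."""
    baskets = list(transactions.values())
    n = len(baskets)

    def support(itemset):
        # Fraction of seed entities whose transaction contains every item.
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1 candidates: every single item seen in any transaction.
    level = {frozenset([item]) for basket in baskets for item in basket}
    result = {}
    size = 1
    while level and size <= max_size:
        kept = {c: support(c) for c in level}
        kept = {c: s for c, s in kept.items() if s >= min_support}
        result.update(kept)
        # Join frequent k-itemsets into (k+1)-item candidates.
        size += 1
        level = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
    return result


if __name__ == "__main__":
    tx = build_transactions(TRIPLES, ["Einstein", "Planck", "Bohr"])
    for itemset, sup in frequent_itemsets(tx, min_support=0.6).items():
        print(sorted(itemset), round(sup, 2))
```

In this toy run, items shared by at least 60% of the seed entities survive the threshold, and a relation held by only one seed is discarded. In an entity set expansion setting, the surviving itemsets would be the candidates from which shared-relation paths between seed entities are built and then weighted.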
Keywords/Search Tags:Heterogeneous information network, knowledge graph, meta path, similarity measure