Research On Most Influential Community Search In Heterogeneous Information Network

Posted on: 2022-12-10  Degree: Master  Type: Thesis
Country: China  Candidate: W Yang  Full Text: PDF
GTID: 2480306779971969  Subject: Psychiatry
Abstract/Summary:
A heterogeneous information network (HIN) is a network whose vertices and edges have different types, which allows it to express rich semantic information. The community search problem on HINs aims to identify cohesive communities in the network that contain a specific query vertex. The search results place more emphasis on the types of vertices and on whether the local structure of the network meets the user's query needs. Community search is often applied to complex protein networks, object recognition, and friend recommendation in social networks. Existing research on community search in HINs usually relies on vertex type, minimum degree, and network structure to establish cohesive communities and to query cohesive subgraphs. However, these methods ignore the potential impact of vertex influence on the query results, and they do not consider the upper limit on result size that users face in practical applications because of factors such as cost.

The main work and contributions of this thesis are as follows.

First, building on existing methods, the concept of a combinatorial constraint model is proposed as the criterion for community cohesion. The model constrains the structure of the community by jointly considering vertex influence, community size, community cohesion, and the types of vertices and edges, and thereby returns a more efficient result community.

Second, a basic algorithm based on vertex enumeration is proposed, which guarantees the accuracy of the returned community. On this basis, an optimized algorithm, CIEN, is proposed. CIEN uses two methods to significantly improve query efficiency: (1) a preprocessing step shrinks the set of vertices to be enumerated before enumeration begins, which reduces the number of enumeration branches; (2) the enumeration is pruned according to properties of subgraph influence, which avoids redundant computation and reduces the enumeration depth.

Third, to address the limitations of the vertex enumeration algorithm, a grouping enumeration algorithm named GIEN is proposed. Based on the potential connections between vertices of different types in the network, the algorithm groups the vertices to be enumerated and enumerates combinations of them, which improves query efficiency and reduces the redundant computation present in the vertex enumeration algorithm. In addition, an NCI index that records vertex type and threshold information is proposed. With this index, community cohesion can be checked more efficiently while the algorithm runs.

Finally, the effectiveness and efficiency of the proposed algorithms are verified on 10 datasets. The experimental results show that these algorithms can accurately and efficiently search for the most influential communities in heterogeneous information networks.
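To make the enumeration idea concrete, the following is a minimal Python sketch of size-bounded influential community search with a preprocessing step and influence-based pruning. It is not the thesis's CIEN or GIEN algorithm, and every name and definition in it is an assumption made for illustration: the network is treated as a single-typed graph (vertex and edge type constraints are omitted), cohesion is taken to be "connected with minimum degree at least k", the influence of a community is taken to be the smallest influence value among its vertices, and k-core peeling stands in for the preprocessing operation.

```python
from itertools import combinations


def k_core(adj, k):
    """Preprocessing (assumed): repeatedly drop vertices with fewer than k neighbours."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if sum(1 for u in adj[v] if u in alive) < k:
                alive.remove(v)
                changed = True
    return alive


def is_connected(adj, nodes):
    """Depth-first search restricted to the given vertex set."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in nodes and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == nodes


def is_cohesive(adj, nodes, k):
    """Assumed cohesion criterion: minimum degree >= k inside the set, and connected."""
    nodes = set(nodes)
    min_degree_ok = all(sum(1 for u in adj[v] if u in nodes) >= k for v in nodes)
    return min_degree_ok and is_connected(adj, nodes)


def most_influential_community(adj, influence, q, k, max_size):
    """Enumerate vertex sets containing q with at most max_size vertices.

    Returns (community, score), or (None, None) if no cohesive community exists.
    """
    candidates = k_core(adj, k)
    if q not in candidates:
        return None, None
    others = sorted(candidates - {q}, key=lambda v: -influence[v])
    best, best_score = None, None
    # A community with minimum degree k needs at least k + 1 vertices.
    for extra in range(k, max_size):
        for combo in combinations(others, extra):
            nodes = (q,) + combo
            score = min(influence[v] for v in nodes)
            # Pruning: skip sets that cannot beat the best score found so far.
            if best_score is not None and score <= best_score:
                continue
            if is_cohesive(adj, nodes, k):
                best, best_score = set(nodes), score
    return best, best_score
```

For example, on a graph made of two triangles that share the query vertex `a`, with `k = 2` and `max_size = 3`, the sketch returns the triangle whose weakest vertex has the higher influence value. The enumeration is exponential in the worst case, which is the cost that preprocessing, pruning, and grouping are meant to reduce.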
Keywords/Search Tags:heterogeneous information network, community search, influence value