
The Research On Complex Conditional Community Search In Heterogeneous Information Network

Posted on: 2022-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
Full Text: PDF
GTID: 2480306332474094
Subject: Computer technology
Abstract/Summary:
Community structure is an important attribute of an information network: nodes within a community are closely connected, while connections between communities are relatively sparse. Community search aims to find a community that contains a given node. It has a wide range of real-world applications, such as event organization, friend recommendation, protein identification, and e-commerce advertising. Because it is fast and personalized, community search has received increasing attention from researchers in recent years and has become an important topic in information network analysis.

Heterogeneous information networks have more complex structures and node relationships, and carry richer semantic information, than homogeneous networks. Communities found in heterogeneous information networks therefore contain richer information and can provide stronger support for various applications. The search conditions supported by existing community search algorithms for heterogeneous information networks are relatively simple, involving only a single query node and a single symmetric meta-path. These algorithms have difficulty handling search problems with complex conditions, such as asymmetric meta-paths, restricted meta-paths, and forbidden node constraints. That is, the nodes in the result community are of a different type from the query node, the nodes in the result community must be connected by meta-path instances that contain constraint objects, and the result community must not contain certain types of nodes. Compared with existing research, community search with complex conditions faces more challenges.

This thesis studies the problem of community search with complex conditions in heterogeneous information networks. The main work includes:

(1) For asymmetric meta-paths, a Community Search for Asymmetric Meta-Path (CSAMP) algorithm is proposed. It completes an asymmetric meta-path into a symmetric meta-path through meta-path completion strategies, so that a symmetric meta-path community search algorithm can be used to find communities.

(2) For restricted meta-paths, a Community Search for Constrained Meta-Path (CSCMP) algorithm is proposed. While finding neighbor nodes through a breadth-first search strategy, CSCMP uses only the meta-path instances that contain constraint objects to find new neighbors, ensuring that nodes in the resulting community are connected by restricted meta-path instances.

(3) For forbidden node constraints, a Community Search for Prohibit Node Constraint (CSPNC) algorithm is proposed. It first uses the CSCMP algorithm to search for the initial community formed by all nodes closely related to the query node. It then uses the CSCMP algorithm again to obtain a forbidden community composed of nodes closely related to the forbidden node. Finally, it computes the relative complement of the forbidden community in the initial community, and the community composed of the nodes in this complement is returned as the search result of CSPNC.

(4) To improve search efficiency, two variants are proposed: CSPNC-PS (CSPNC with Pruning Strategy) and CSPNC-AS (CSPNC with Approximation Strategy). CSPNC-PS bases its pruning strategy on the property that a community search from any node in a community can only return that same community. That is, when the query node already belongs to a known community, no community search is performed on it, which avoids repeated search operations. CSPNC-AS uses an approximation strategy: among all nodes connected to the forbidden node through a meta-path, it finds the set of nodes that have at least k neighbors, and treats this set as an approximation of the forbidden community.

(5) Extensive experiments were conducted with the 5 proposed complex-condition community search algorithms on 6 real data sets. The CSAMP and CSCMP algorithms are evaluated mainly on community retrieval ability and algorithm efficiency. For the CSPNC, CSPNC-PS, and CSPNC-AS algorithms, this thesis uses six indicators to evaluate effectiveness and also evaluates running efficiency. Experimental results show that the proposed algorithms perform well.
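
The restricted neighbor search in (2), the relative-complement step in (3), and the approximation in (4) can be illustrated with a small Python sketch. This is not the thesis's implementation. It assumes a toy author/paper/venue graph, assumes that "closely related" means every member has at least k meta-path neighbors inside the community (a k-core-style rule the abstract does not spell out), and assumes that the "at least k neighbors" test in CSPNC-AS counts meta-path neighbors within the candidate set. All function names, the `required` parameter, and the data are hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def metapath_neighbors(adj, node_type, start, metapath, required=None):
    """End nodes of meta-path instances that begin at `start`.
    If `required` is given, keep only instances containing that node
    (the restricted meta-path idea)."""
    if node_type[start] != metapath[0]:
        return set()
    # Each entry: (current node, has this instance visited `required` yet?)
    frontier = {(start, required is None or start == required)}
    for t in metapath[1:]:
        frontier = {(nxt, ok or nxt == required)
                    for cur, ok in frontier
                    for nxt in adj[cur] if node_type[nxt] == t}
    return {n for n, ok in frontier if ok and n != start}

def community_search(adj, node_type, query, metapath, k, required=None):
    """Assumed cohesion rule: every member keeps >= k meta-path neighbors
    inside the community. `metapath` must be symmetric."""
    nbrs, queue = {}, deque([query])
    while queue:                       # breadth-first candidate collection
        u = queue.popleft()
        if u in nbrs:
            continue
        nbrs[u] = metapath_neighbors(adj, node_type, u, metapath, required)
        queue.extend(v for v in nbrs[u] if v not in nbrs)
    alive, changed = set(nbrs), True
    while changed:                     # peel nodes with too few neighbors
        changed = False
        for u in list(alive):
            if len(nbrs[u] & alive) < k:
                alive.discard(u)
                changed = True
    if query not in alive:
        return set()
    comp, queue = {query}, deque([query])
    while queue:                       # keep the query's connected part
        u = queue.popleft()
        for v in nbrs[u] & alive:
            if v not in comp:
                comp.add(v)
                queue.append(v)
    return comp

def approx_forbidden_set(adj, node_type, forbidden, link_path, metapath, k):
    """Approximation in the spirit of CSPNC-AS: nodes reachable from the
    forbidden node via `link_path` that have >= k meta-path neighbors
    among those reachable nodes (assumed reading of the abstract)."""
    cand = metapath_neighbors(adj, node_type, forbidden, link_path)
    return {v for v in cand
            if len(metapath_neighbors(adj, node_type, v, metapath) & cand) >= k}

# Toy data: authors a*, papers p*, venues v* (all hypothetical).
edges = [("a1", "p1"), ("a2", "p1"), ("a3", "p1"), ("p1", "v1"),
         ("a3", "p2"), ("a4", "p2"), ("a5", "p2"), ("p2", "v1"),
         ("a4", "p3"), ("a5", "p3"), ("a6", "p3"), ("p3", "v2"),
         ("a6", "p4"), ("a7", "p4"), ("p4", "v2")]
adj = build_graph(edges)
node_type = {n: n[0].upper() for n in adj}
APA, VPA = ("A", "P", "A"), ("V", "P", "A")

initial = community_search(adj, node_type, "a1", APA, k=2)
forbidden = approx_forbidden_set(adj, node_type, "v2", VPA, APA, k=2)
result = initial - forbidden           # relative complement step
print(sorted(result))
```

In this toy graph, the search from a1 keeps the authors that retain two co-author neighbors, the authors tied to venue v2 form the approximate forbidden set, and the set difference leaves the authors unrelated to v2. Passing `required="p2"` to `metapath_neighbors` restricts the co-author relation to instances that go through paper p2.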
Keywords/Search Tags: Heterogeneous information network, Community search, Meta-path, Complex conditions