Research On Heterogeneous Network Community Detection Based On Meta-path Fusion

Posted on: 2024-01-11  Degree: Master  Type: Thesis
Country: China  Candidate: D H Meng
GTID: 2530306920455244  Subject: Computer Science and Technology
Abstract/Summary:
Community detection is an important research topic in network analysis that can effectively mine hidden information in a network. It has a wide range of applications in product recommendation, public opinion monitoring and advertising. However, most traditional community detection methods are based on homogeneous information networks, in which the network consists of a single type of node and a single type of edge, whereas most real-world networks are closer to heterogeneous information networks, which contain multiple types of nodes or multiple types of edges. Because the heterogeneous information network model gives a fuller picture of the real world and matches our understanding of it, how to exploit the rich information in heterogeneous networks to improve the accuracy of community detection has become both a focus and a challenge of current research. This paper uses meta-path analysis to study community detection methods for heterogeneous information networks. The work includes the following:

1. Feature representation: To reduce the information dimension in a heterogeneous information network, this paper proposes a behavior-based feature representation (BFR), which uses meta-paths as the medium for the behavior information of target nodes. The node type to be partitioned is taken as the target type and the meta-path information as the behavior information. The high-dimensional behavior information of each target node is fused into a low-dimensional fusion factor, so that the features of the target node are represented in a low-dimensional space.

2. Similarity measurement: Meta-path-based analysis is an effective way to analyze heterogeneous networks. Building on link-path similarity measures, this paper proposes a connection-based similarity measure (CSM), which measures the similarity between nodes by fusing the meta-paths between them.

3. K-means+: To address the K-means algorithm's excessive dependence on the initial seeds, this paper modifies the seed presetting of the original K-means algorithm and adds a constraint function, yielding the improved K-means+ algorithm. The community partition of the target nodes is then completed with the K-means+ algorithm.

Experiments are carried out on three heterogeneous information network datasets: Iris, DBLP and Cora. The effectiveness of the BFR algorithm for dimensionality reduction is verified by the fusion factor variance replacement rate. The effectiveness of the CSM algorithm is verified by comparing its community detection performance with that obtained from a single meta-path. The efficiency of the improved K-means+ algorithm is verified by comparison with the original K-means algorithm. Finally, the proposed algorithm is compared with six recent heterogeneous information network community detection algorithms on the three datasets, and the results show that the proposed BFR+CSM+K-means+ algorithm is feasible and effective for community detection in heterogeneous information networks.
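
As a rough illustration of what measuring node similarity by fusing several meta-paths can look like, the following Python sketch may help. It is not the CSM measure of this thesis, whose definition is not given in the abstract. Instead it substitutes the well-known PathSim measure for each meta-path and combines the results with a plain weighted average. The toy author/paper/venue matrices, the function names (`matmul`, `pathsim`, `fuse`) and the equal weights are all assumptions made for illustration.

```python
# Illustrative sketch only: PathSim-style similarity per meta-path, fused by a
# weighted average. This is NOT the thesis's CSM; data and weights are made up.

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    """Return the transpose of a dense matrix."""
    return [list(col) for col in zip(*m)]

def pathsim(commuting):
    """PathSim on a symmetric meta-path: s(i, j) = 2*M[i][j] / (M[i][i] + M[j][j])."""
    n = len(commuting)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = commuting[i][i] + commuting[j][j]
            sim[i][j] = 2.0 * commuting[i][j] / denom if denom else 0.0
    return sim

def fuse(sims, weights):
    """Weighted average of several similarity matrices (hypothetical fusion rule)."""
    n = len(sims[0])
    total = sum(weights)
    return [[sum(w * s[i][j] for w, s in zip(weights, sims)) / total
             for j in range(n)] for i in range(n)]

# Toy heterogeneous network: 3 authors (A), 3 papers (P), 2 venues (V).
author_paper = [
    [1, 1, 0],  # author 0 wrote papers 0 and 1
    [0, 1, 0],  # author 1 wrote paper 1
    [0, 0, 1],  # author 2 wrote paper 2
]
paper_venue = [
    [1, 0],  # paper 0 appeared in venue 0
    [1, 0],  # paper 1 appeared in venue 0
    [0, 1],  # paper 2 appeared in venue 1
]

# Commuting matrices count path instances along each meta-path.
apa = matmul(author_paper, transpose(author_paper))      # A-P-A
author_venue = matmul(author_paper, paper_venue)         # A-P-V
apvpa = matmul(author_venue, transpose(author_venue))    # A-P-V-P-A

# Fuse the two per-meta-path similarities with equal (assumed) weights.
fused = fuse([pathsim(apa), pathsim(apvpa)], weights=[1.0, 1.0])

for row in fused:
    print([round(v, 3) for v in row])
```

In this toy network, authors 0 and 1 share a paper and a venue, so their fused similarity is high, while author 2 is connected to neither and scores zero against both. A fused matrix of this kind could then be passed to a clustering step such as K-means.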
Keywords/Search Tags:heterogeneous network, feature representation, meta-path, similarity measurement, K-means