| With the growth of social network platforms and the increase in their users, social activities on these platforms leave behind a huge amount of data. However, social networks are independent of one another, which creates ‘islands’ of user information. Cross-social-network user identification processes and analyzes user data from different social networks to identify the accounts that belong to the same real-world person, thereby integrating the user’s data across social networks and establishing a comprehensive user profile. Cross-social-network user identification is of great significance for commercial advertising, friend recommendation, and the maintenance of network security. Research on cross-social-network user identification algorithms has made some progress. However, algorithms that use only user topology information still suffer from low accuracy and high time complexity, because the prior seed node information is not used comprehensively and because calculating user similarity with the common neighbor indicator produces controversial seed nodes (that is, a user of one network has the same matching similarity with multiple users in another network). To address these problems, this thesis proposes a two-stage user identification algorithm based on user topology information with dynamic community clustering. In view of the low accuracy and high time complexity of traditional topology-based algorithms, some scholars have introduced community clustering methods, which improve accuracy to a certain extent. However, prior seed node information is still underused, the fixed clustering of communities degrades recognition, and the time complexity remains high. In light of this, this thesis first performs community clustering on the different social networks, calculates the similarity between communities of different social networks according to the number of prior seed nodes they share, and 
selects the community pairs with high similarity. Second, the users within each highly similar community pair are matched. Finally, the matched nodes that pass two-way matching are added to the set of seed user pairs. After this process is completed, the number of clusters for each network is reset (the number of communities is decreased by a certain step, which is what makes the community clustering dynamic), the communities are clustered again, and the users in relatively similar communities are matched. The iteration repeats until no new matching user pairs are generated or the preset number of iterations is reached. Dynamic community clustering searches communities from different perspectives and finds more matching user pairs, which improves the accuracy of user identification and reduces the running time of the algorithm. The experimental results show that the average accuracy of the proposed algorithm is 31.7% and 26.5% higher than that of the global matching algorithm based on user topological relationships and the algorithm based on hidden tag nodes, respectively, and that the running time is reduced by 86% and 66.5%, respectively. To solve the controversial seed node problem caused by the user similarity calculation, this thesis introduces the closeness centrality of nodes, which is based on the average shortest-path distance between a node to be matched and the other nodes, and calculates the similarity of users to be matched by combining closeness centrality with common neighbors. The experimental results show that, compared with the traditional method based on the common neighbor indicator, introducing closeness centrality improves user identification to a certain extent. To further improve the accuracy of user identification, this thesis introduces the Resource Allocation (RA) indicator and the Adamic-Adar (AA) indicator to calculate user similarity within the two-stage dynamic clustering algorithm. The experiments show that the RA 
indicator can identify the same user identity more effectively. Figure[30] Table[8] Reference[79]... |
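
The controversial seed node problem and the effect of the RA and AA indicators described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the toy graphs, and in particular the choice to define the degree of a seed pair as the sum of its degrees in the two networks are assumptions made only for illustration. Both networks are assumed to be undirected and stored as symmetric adjacency sets.

```python
import math


def shared_seed_neighbors(u, v, adj_a, adj_b, seeds):
    """Seed pairs (a, b) where a is a neighbor of u in network A
    and b is a neighbor of v in network B."""
    return [(a, b) for a, b in seeds.items() if a in adj_a[u] and b in adj_b[v]]


def similarity(u, v, adj_a, adj_b, seeds, kind="CN"):
    """Cross-network similarity of user u (network A) and user v (network B).

    kind: "CN" (common neighbors), "AA" (Adamic-Adar), "RA" (resource allocation).
    """
    score = 0.0
    for a, b in shared_seed_neighbors(u, v, adj_a, adj_b, seeds):
        # Assumption: the degree of a seed pair is the sum of its degrees
        # in both networks (always >= 2 here, so log(degree) > 0).
        degree = len(adj_a[a]) + len(adj_b[b])
        if kind == "CN":
            score += 1.0
        elif kind == "AA":
            score += 1.0 / math.log(degree)
        elif kind == "RA":
            score += 1.0 / degree
        else:
            raise ValueError(f"unknown indicator: {kind}")
    return score


# Hypothetical toy networks: a1/b1 and a2/b2 are known seed pairs.
adj_a = {
    "a1": {"u1", "u2", "a2"},
    "a2": {"u1", "a1"},
    "u1": {"a1", "a2"},
    "u2": {"a1"},
}
adj_b = {
    "b1": {"v1", "v3", "b2"},
    "b2": {"v2", "b1"},
    "v1": {"b1"},
    "v2": {"b2"},
    "v3": {"b1"},
}
seeds = {"a1": "b1", "a2": "b2"}

for kind in ("CN", "AA", "RA"):
    s1 = similarity("u1", "v1", adj_a, adj_b, seeds, kind)
    s2 = similarity("u1", "v2", adj_a, adj_b, seeds, kind)
    print(f"{kind}: sim(u1, v1) = {s1:.3f}, sim(u1, v2) = {s2:.3f}")
```

In this toy case the common neighbor indicator gives u1 the same score (1) with both v1 and v2, which is the controversial situation. The RA and AA scores differ (1/6 versus 1/4 for RA) because they down-weight the shared seed with the higher degree, so the tie is broken.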