A High Influential User Discovery Algorithm Based On User Behavior Analysis In Social Networks

Posted on: 2019-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: B J Zhang
Full Text: PDF
GTID: 2428330578472762
Subject: Computer technology
Abstract/Summary:
In recent years, the rapid emergence of social applications has continued to meet people's social needs and has provided data to support research. As a leading social media platform offering microblogging services, Sina Weibo has attracted a large number of users. With this service, people can freely share their ideas, comment on others' posts, and forward interesting news. High-influential users in microblogs often play a vital role in information propagation in social networks, and identifying these users, who are hidden in the network, is of great importance for public opinion analysis and corporate marketing. To address this problem, we design an algorithm that identifies high-influential users by analyzing user behaviors in social networks. We then visualize the geographical distribution of high-influential users on Baidu Map.

The main work of this thesis is summarized as follows. Firstly, we introduce the concepts of social media such as Sina microblog, complex networks, and related theoretical methods, which lay the theoretical foundation for our study. We analyze user behaviors in the microblog network, select user feature attributes based on those behaviors, eliminate zombie followers, and adjust the user feature attribute values according to the sociological concept of homophily (homogeneity). The similarities between users and their followers are calculated with the Word2vec model. We design a formula that adjusts the weight of each feature according to these similarity values, which is consistent with sociological laws.

Secondly, a hybrid of the Canopy and K-means clustering algorithms is adopted to group the users. The number of clusters (the value of k) is obtained from the Canopy algorithm. Clustering greatly narrows the range in which high-influential users are searched for: only the candidate clusters that match the characteristics of high-influential users are selected, which reduces the time cost and improves efficiency. The behaviors of followers and followees are then analyzed to measure their influence. We propose an algorithm (HIUD) to identify the most influential users.

To evaluate the performance of our algorithm, we run it on real-world data sets from Sina and Tencent. The top 15 high-influential users are selected as the experimental results, and we compare the rankings with those produced by other algorithms. The comparison indicates that our algorithm is of greater practical significance. By visualizing the distribution of the top 15 high-influential users on Baidu Map, we obtain the geographical distribution characteristics of high-influential users. Finally, potential future work is discussed.
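
The Canopy and K-means step described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the Euclidean distance, the thresholds t1 and t2, and the toy two-dimensional feature vectors are all assumptions made for illustration. It shows only the general idea that a Canopy pass yields both the number of clusters k and the initial centers that K-means then refines.

```python
import math


def canopy_centers(points, t1, t2):
    """Single Canopy pass. t1 is the loose threshold, t2 the tight one (t1 > t2).

    Returns a list of (center, members) pairs. The number of pairs is the k
    that is handed to K-means. Thresholds here are illustrative assumptions.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)
        # Every point within the loose threshold belongs to this canopy.
        members = [p for p in points if math.dist(center, p) <= t1]
        canopies.append((center, members))
        # Points within the tight threshold can no longer become centers.
        remaining = [p for p in remaining if math.dist(center, p) > t2]
    return canopies


def kmeans(points, seeds, iterations=50):
    """Plain Lloyd iterations, seeded with the Canopy centers."""
    centers = [tuple(s) for s in seeds]

    def nearest(p):
        return min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))

    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p)].append(p)
        # Recompute each center as the mean of its group; keep empty ones as-is.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(p) for p in points]
    return centers, labels


# Hypothetical feature vectors (for example, two scaled behavior attributes).
users = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
canopies = canopy_centers(users, t1=5.0, t2=3.0)
k = len(canopies)  # here k == 2
centers, labels = kmeans(users, [c for c, _ in canopies])
```

In this toy run the Canopy pass produces two canopies, so K-means starts with k = 2 and assigns the first three points to one cluster and the last three to the other. How the thesis chooses its thresholds and which clusters it keeps as candidates are not stated in the abstract.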
Keywords/Search Tags:Social network, Sina microblog, High-influential users, Word2vec, PageRank