
Research On The Method Of Sina Weibo Community Detection

Posted on: 2019-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: L Gou
Full Text: PDF
GTID: 2428330566466999
Subject: Computer application technology
Abstract/Summary:
With the development and popularization of social networks, the amount of data generated each day is growing at a rate measured in the billions. By analyzing the opinions that users publish on social media, much valuable information can be extracted. Data research on Sina Weibo, a mainstream social media platform, has achieved certain results, but the range of methods used for community discovery is still limited. This paper studies community discovery and proposes a new approach: starting from the keywords of Weibo content, it finds groups of users with similar characteristics. The main contents are as follows.

Firstly, this paper proposes a new data crawling strategy based on Sina Weibo's data distribution structure, users' fan groups, user information, and other characteristics. Starting from user IDs, it selects seed users and crawls each user's fan group, data features, Weibo content, the interaction information under each microblog, and other information. Based on the collected data, it also proposes data filtering and related methods.

Secondly, this paper analyzes the collected data from the perspectives of the regional distribution of microblogs, their posting time, and the interactions on them. From these aspects, it identifies how the users in the data set use Weibo and characterizes their habits and interactions on the platform. Based on microblog interactions, the traditional relationship between users is replaced by the interaction relationship. The entropy force model is used to draw the social network graph, and the interaction characteristics of the data set are identified.

Finally, starting from the microblog keywords, this paper uses the TF-IDF algorithm to calculate word weights, finds the representative weighted words of each user, and uses the K-MEANS algorithm to cluster all users. The clustering yields 7 different types of groups, and the characteristics and focus of each group are analyzed. In addition, this paper selects the SOM clustering algorithm and the MeanShift clustering algorithm for comparative experiments, collects statistics on the results produced by the different clustering algorithms, and compares them to assess the resulting community classification.
Keywords/Search Tags:Sina Weibo, data analysis, community discovery, TF-IDF, K-MEANS
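
The final step described in the abstract (TF-IDF word weights per user, followed by K-MEANS clustering of users) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy keyword lists, the smoothed IDF formula, the L2 normalization, the farthest-point initialization, and the choice of 2 clusters instead of the 7 groups reported above. Real Weibo text would also need Chinese word segmentation, which is omitted.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn token lists (one per user) into L2-normalized TF-IDF vectors."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    # Smoothed IDF (an assumed variant; the thesis does not state its formula).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = [(tf[t] / len(d)) * idf[t] if d else 0.0 for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical users and keywords, standing in for segmented Weibo content.
user_keywords = {
    "user_a": ["football", "match", "goal", "football"],
    "user_b": ["goal", "match", "league"],
    "user_c": ["phone", "camera", "chip"],
    "user_d": ["chip", "phone", "laptop"],
}
users = list(user_keywords)
vocab, vectors = tfidf_vectors([user_keywords[u] for u in users])
labels = kmeans(vectors, k=2)

for user, vec, label in zip(users, vectors, labels):
    top_word = vocab[max(range(len(vocab)), key=lambda i: vec[i])]
    print(user, "cluster", label, "top keyword:", top_word)
```

In this toy data the two sports-oriented users and the two electronics-oriented users share no keywords across groups, so their TF-IDF vectors are orthogonal and the clustering separates them. A comparison with SOM or MeanShift, as the abstract mentions, would reuse the same vectors and swap only the clustering step.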