Research On The Discovery Of Overlapping Community Of Weibo Based On Topics

Posted on: 2019-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Lei
GTID: 2417330545952655
Subject: Applied statistics
Abstract/Summary:
The rapid development of the Internet has brought Facebook, Twitter, Sina and other microblogging networks into public life. On a microblogging network, users can write about their own daily experiences and can also forward content from other users they follow. Each user also has an address, labels and other attribute information, so a microblogging network carries a huge amount of information. Through community detection, the users in a network can be divided into different communities according to their hobbies, interaction frequencies or attributes. However, if users are divided by interest and each interest is treated as a community, a user with more than one hobby appears in more than one community, which gives rise to overlapping communities.

This paper proposes a method for discovering overlapping communities of microblogging users based on the topics of their microblogs. The method has two parts: the first discovers overlapping communities based on topic distribution, and the second partitions users into overlapping communities based on topic propagation. The two parts differ in their research objects: the former studies users who are already on the platform, while the latter targets users who are new to it. Community detection yields multiple communities for the users already on the platform, but when new users appear and their community distribution is wanted, rerunning community detection on the new and existing users together would leave the earlier detection results unused in later work and would also increase the workload. The paper therefore proposes community partition, which assigns new users to the communities already obtained from community detection on the existing users.

The topic-distribution method uses the LDA model to extract multiple topics from the content of microblogs and treats each topic as a community. Each user is then assigned to overlapping communities according to the topic probability distribution of the long text formed from that user's microblogs. This differs from traditional work on LDA: rather than only assigning words to topics, the paper also identifies the practical meaning of each topic, such as music or travel, from the words with the highest probability under it, instead of referring to topics by serial number. The real meaning of each community is therefore known, which has practical significance.

The topic-propagation method for overlapping community partition aims to avoid rerunning community detection on the new users together with those already on the platform. For new users, the method constructs a word relation network and propagates topics to the keywords of the new users' microblogs (users whose community distribution is unknown) from highly similar words in the microblogs of existing users (users who are already on the platform and whose community distribution is known). Finally, the users whose communities are unknown are divided into communities according to the topic distribution of their microblogs' keywords.

Lastly, the paper uses crawler technology to collect data on Twitter users and carries out experiments on these data. The experimental results show that the recall of the topic-distribution method for discovering overlapping communities of microblogging users and of the topic-propagation method for partitioning microblogging users into overlapping communities is 75.4% and 84.72%, respectively.
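
The two steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the topic distributions are hard-coded stand-ins for LDA output, and the membership threshold, the similarity cutoff, the similarity-weighted averaging rule and the function names `assign_communities` and `propagate_topics` are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


def assign_communities(topic_dist, threshold=0.25):
    """Return every topic whose probability meets the threshold.

    A user can match several topics, which is what makes the
    communities overlap. The threshold value is an assumption.
    """
    members = [t for t, p in topic_dist.items() if p >= threshold]
    # Fall back to the single most probable topic so no user is unassigned.
    return members or [max(topic_dist, key=topic_dist.get)]


def propagate_topics(keywords, word_topics, similarity, min_sim=0.5):
    """Estimate a new user's topic distribution from known words.

    word_topics maps words from existing users' microblogs to topic
    distributions; similarity maps (keyword, known_word) pairs to a
    score. Both structures are hypothetical stand-ins for the word
    relation network described in the abstract.
    """
    scores = defaultdict(float)
    for kw in keywords:
        for known, dist in word_topics.items():
            sim = 1.0 if kw == known else similarity.get((kw, known), 0.0)
            if sim < min_sim:
                continue  # only sufficiently similar words spread their topics
            for topic, p in dist.items():
                scores[topic] += sim * p
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()} if total else {}


# Existing user: topic distribution assumed to come from an LDA model.
existing_user = {"music": 0.55, "travel": 0.35, "sports": 0.10}
print(assign_communities(existing_user))

# New user: topics are propagated from similar known words to the keywords.
word_topics = {
    "concert": {"music": 0.9, "travel": 0.1},
    "flight": {"music": 0.1, "travel": 0.9},
}
similarity = {("gig", "concert"): 0.8, ("airport", "flight"): 0.7}
new_dist = propagate_topics(["gig", "airport"], word_topics, similarity)
print(assign_communities(new_dist))
```

In this sketch the new user is placed into the existing communities without rerunning detection on all users, which is the motivation the abstract gives for community partition.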
Keywords/Search Tags:microblog, topic distribution, overlapping community detection, topic propagation, overlapping community partition