Research On The Discovery And Evolution Mechanism Of Implicit Communities Based On The Homogeneity Of Microblog Topic

Posted on: 2021-04-13 | Degree: Master | Type: Thesis
Country: China | Candidate: Z F Sun | Full Text: PDF
GTID: 2428330626966127 | Subject: Software engineering
Abstract/Summary:
With the continual advance of information technology and the rapid development of the Internet, network data has grown geometrically. In particular, the data accumulated through social networking and information dissemination has grown at an unprecedented rate. As a widely used, large-scale, real-time online social platform, microblog makes it convenient for users to express opinions and comments on events or topics that interest them. Similarly, news websites are extremely important platforms for information dissemination, giving users convenient access to hot news at home and abroad. As these platforms have developed, they have accumulated huge amounts of data. Statistically analyzing this massive data, identifying its characteristics, and mining valuable information from it are of far-reaching significance for real-world applications such as commodity and information recommendation and public opinion guidance. Research shows that community discovery and community evolution are favored as important methods for analyzing social network data. In current related research, however, community discovery and evolution based on text topic homogeneity has not received enough attention. The main contributions of this thesis are as follows:

1. An implicit community discovery algorithm based on a representation of the homogeneity of Microblog topics is proposed, in order to discover implicit topic communities on Microblog social networks. First, the Microblog corpus is pre-processed, all posts published by each user are concatenated into one document, and the LDA model is used to extract each user's topic features, which characterize that user. Second, based on topic interest, a homogeneity measure between each pair of users is calculated to represent the user relationships of the social network. Then, an unsupervised algorithm is used to construct implicit communities from the homogeneous topic-interest relationships, realizing implicit community discovery on the implicit social network. Finally, a data set from Chinese Microblog is used to verify the implicit community discovery algorithm. The experiment shows that the silhouette coefficient between implicit communities of the network based on topic homogeneity reaches 0.94, which is clearly better than that of the network represented by TF-IDF vectors.

2. A time-related topic evolution model is constructed to mine the trend of topic evolution, so as to predict the evolution trend of communities based on topic discovery. First, a training subset is sampled from the corpus and a topic model is built on it with LDA. Second, the corpus is divided into subsets by time slice. Then, the corpus subsets of the different time slices are classified by the trained topic model, and the intensity of each topic in each time slice is calculated to characterize the evolution of the topic. Finally, Kaggle's news headline data set is used to verify the topic community evolution algorithm. The experiment shows that the LDA topic model extracts document topics well, without the "crowding" phenomenon, and is better than the LSA model. It also verifies that the topic evolution model constructed in this thesis reflects well how the strength of each topic changes across time slices.
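The two quantities at the heart of these contributions, the homogeneity between two users and the intensity of a topic in a time slice, can be illustrated with a minimal Python sketch. The abstract does not state the formulas used, so everything here is an assumption: homogeneity is taken to be the cosine similarity of two users' topic distributions, intensity is taken to be the mean topic proportion over the documents in a slice, and the function names `cosine_homogeneity` and `topic_intensity` are hypothetical. The topic vectors are written by hand; in the thesis they would come from an LDA model.

```python
import math
from collections import defaultdict


def cosine_homogeneity(p, q):
    """Cosine similarity of two topic distributions.

    Assumption: the abstract does not name its homogeneity measure;
    cosine similarity is used here only as a plausible stand-in.
    """
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def topic_intensity(doc_topics, doc_slices, num_topics):
    """Mean topic proportion per time slice.

    Assumption: "intensity" is taken to be the average weight a topic
    receives over all documents that fall in the same time slice.
    Returns a dict mapping each slice label to a list of intensities,
    with slices in sorted order.
    """
    sums = defaultdict(lambda: [0.0] * num_topics)
    counts = defaultdict(int)
    for theta, time_slice in zip(doc_topics, doc_slices):
        counts[time_slice] += 1
        for k, weight in enumerate(theta):
            sums[time_slice][k] += weight
    return {s: [v / counts[s] for v in sums[s]] for s in sorted(sums)}


# Hypothetical per-user topic distributions (one row per user, 3 topics).
users = {
    "u1": [0.8, 0.1, 0.1],
    "u2": [0.7, 0.2, 0.1],
    "u3": [0.1, 0.1, 0.8],
}

# Pairwise homogeneity: u1 and u2 share a dominant topic, u3 does not.
print(cosine_homogeneity(users["u1"], users["u2"]))
print(cosine_homogeneity(users["u1"], users["u3"]))

# Hypothetical per-document topic distributions with time-slice labels.
docs = [[0.6, 0.4], [0.8, 0.2], [0.2, 0.8]]
slices = ["2019", "2019", "2020"]
print(topic_intensity(docs, slices, num_topics=2))
```

Under these assumptions, the pairwise homogeneity values would form the edge weights of the implicit social network that the unsupervised community discovery step operates on, and plotting each topic's intensity across slices would show whether the topic is strengthening or weakening over time.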
Keywords/Search Tags: Topic Extraction, Homogeneity, Implicit Community Discovery, Topic Evolution, Topic Intensity, Community Evolution