Research On Topic Diffusion In Online Social Networks

Posted on: 2016-09-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 1318330536967207
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Web 2.0 technologies, online social networks have become important places for people to obtain, publish, and spread information, and they have made the world flatter. Online social network applications not only help people carry their real-world relationships (between classmates, colleagues, friends, and so on) onto the Internet, but also bring users closer together. Information about real-world events and activities is discussed in social networks as topics, and these topics propagate very quickly along the social relations between users. Besides serving as a channel for propagating real-world topics, online social networks also influence how the related events and activities develop in the real world. Research on topic diffusion in online social networks is therefore very important to public security.

This research faces great challenges: the text is very short; its semantics are hard to understand; the number of micro-blogs is huge; the structure of the social networks is complex; and information spreads very fast. Building on existing research, this work addresses four research points: a topic text representation model, predicting the popularity of topics, detecting online paid posters in promotion campaigns, and the topical interests of online users. The main research contents and contributions are as follows:

(1) For the topic text representation model, we propose a new text representation model for short text in online social networks, based on concepts in an external knowledge base. The traditional "Bag of Words (BOW)" text representation scheme is based on word co-occurrence and ignores the semantic relations between words. In online social networks the text is very short and non-standard, so word co-occurrence is rare and the BOW model is not suitable. Wikipedia is used as an example of an external knowledge base in our research. Using an inverted index built from the contents of Wikipedia articles, a document can be represented as a vector of Wikipedia concepts instead of a vector of words. In Wikipedia, concepts are connected by hyperlinks that express the semantic relationships between them. We construct a semantic matrix using the WLM method, which is based on the Wikipedia link structure, to enrich the semantic relationships between concepts. The enriched Wikipedia-concept-based text representation can be used in text classification and clustering. Extensive experiments on text classification datasets show that our model outperforms the traditional BOW text representation model.

(2) For predicting the popularity of topics, we propose a method based on users' historical sentiment. Traditional methods try to predict the popularity of online content from its early popularity; we instead try to predict the popularity of topics that have not yet appeared. We compute the total sentimental energy of a community on a topic from the users' sentiment on that topic, using a Markov Random Field (MRF) model and a graph entropy model. Our experiments find a linear correlation between the total topical sentimental energy and the real popularity of topics in a community. Based on this finding, we propose two models to predict the popularity of topics. Experimental results show the effectiveness of both models.

(3) For detecting online paid posters in promotion campaigns, we try to find the campaigns in which messages propagate unnaturally. Traditional methods, which are based on individual characteristics, ignore the group characteristics of paid posters. We study these group characteristics and propose a method that detects paid posters in online social networks using both individual and group characteristics. Extensive experiments on three real SINA Weibo datasets show that our method is better than existing methods. Furthermore, we study and verify the group effect of paid posters. Experimental results show that most paid posters belong to a few communities and that most of those in the same community hold the same viewpoint. We also identify the promoters from the detected paid posters, and find that most paid posters try to promote micro-blogs from a few promoters.

(4) For the topical interests of online users, most existing research is based on the content that users post. The computational cost of these methods is too high for online social networks with hundreds of millions of users. In SINA Weibo, we find that only 21.8% of all users tag themselves with personal interest tags, while the remaining users do not. We propose a method to mine the personal interests of all users in online social networks from the self-defined interest tags of some users. Our hypothesis is that if a user retweets or mentions someone, the two users share the same personal interests. We construct an interaction graph from the interactive relationships between users, and apply a random walk model on this graph to infer users' personal interests from the self-defined interest tags of some users. We also rank the interest tags of each user. Experiments on a dataset with 140 million SINA Weibo users show that our method performs better and faster than existing approaches.

In summary, this thesis presents technical solutions to several essential issues of topic propagation in online social networks. Experiments on real datasets show that our methods achieve their goals. The work is significant for both theoretical research and practical applications of topic propagation technologies in online social networks.
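To make the interest-mining idea in contribution (4) more concrete, the following is a minimal sketch, not the dissertation's actual model. It assumes an undirected interaction graph and an absorbing random walk: users with self-defined tags act as absorbing states, and an untagged user's score for a tag is the probability that a walk starting from that user is first absorbed at a user carrying that tag. The function names `propagate_interest_tags` and `ranked_tags`, the uniform weighting of a user's own tags, and the fixed iteration count are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_interest_tags(edges, seed_tags, iterations=50):
    """Spread self-defined interest tags over an interaction graph.

    edges: iterable of (user_a, user_b) pairs, one per retweet or mention
           relationship; treated as undirected (an assumption).
    seed_tags: dict mapping a tagged user to a list of self-defined tags.
    Returns a dict mapping every user to a {tag: score} dict.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Tagged users are absorbing states: their tag distribution stays fixed
    # (uniform over their own tags, which is an illustrative choice).
    seeds = {}
    for user, tags in seed_tags.items():
        unique = set(tags)
        if unique:
            seeds[user] = {tag: 1.0 / len(unique) for tag in unique}

    users = set(neighbors) | set(seeds)
    scores = {u: dict(seeds.get(u, {})) for u in users}

    for _ in range(iterations):
        updated = {}
        for u in users:
            if u in seeds:
                updated[u] = seeds[u]
                continue
            # One random-walk step: average the neighbors' current scores.
            mixed = defaultdict(float)
            nbrs = neighbors[u]
            for v in nbrs:
                for tag, score in scores[v].items():
                    mixed[tag] += score / len(nbrs)
            updated[u] = dict(mixed)
        scores = updated
    return scores


def ranked_tags(scores, user):
    """Return a user's (tag, score) pairs, highest score first."""
    return sorted(scores[user].items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: two tagged users and two untagged users in a chain.
example_edges = [("alice", "carol"), ("carol", "dave"), ("dave", "bob")]
example_seeds = {"alice": ["music"], "bob": ["sports"]}
example_scores = propagate_interest_tags(example_edges, example_seeds)
print(ranked_tags(example_scores, "carol"))
```

In this toy chain the untagged user closest to a tagged user ends up ranking that user's tag first, which mirrors the hypothesis that interacting users share interests. Each pass touches every edge once, which is one reason a graph-based approach of this kind can be cheaper than analysing the content of every post.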
Keywords/Search Tags: Social Network, Topic Diffusion, Text Representation Model, Paid Poster, Interest