
Research On Feature Analysis And Prediction Based On Microblog Retweeting

Posted on: 2020-04-02    Degree: Master    Type: Thesis
Country: China    Candidate: C L Fu    Full Text: PDF
GTID: 2428330578983440    Subject: Engineering
Abstract/Summary:
With the rapid development of Internet technology, more and more netizens establish their social relationships on platforms such as Facebook, Sina Weibo, Tencent Weibo and Renren.com. In a microblog network, when a user publishes a post, other users may retweet it, and this forwarding process spreads information rapidly. Beneficial information can have a positive impact on society, whereas harmful information can spread virally, stir up public opinion, and ultimately harm society. Studying users' retweeting behavior and analyzing how it propagates makes it possible to predict the retweeting outcome of a given microblog post in advance, which is of great significance for research on online public opinion, advertising, business decision-making and other fields.

Based on an analysis of the patterns in microblog users' retweeting behavior in social networks, this paper proposes a comprehensive and novel method consisting of three parts: construction of the feature set affecting microblog retweeting, a feature selection model based on Filter and Wrapper methods, and a microblog retweeting prediction model based on ensemble learning.

To address the shortcoming that mainstream methods consider only a narrow set of retweeting factors, this paper proposes a method for constructing a system of features that affect microblog retweeting. Because a user's interests change over time, this paper proposes an LDA topic model based on interest drift. Because geographic location is closely correlated with users' retweeting behavior, this paper proposes a similarity measure based on geographic location. Previous studies rarely considered the network structure and the interaction between users, so this paper proposes features including the user clustering coefficient, the neighborhood overlap between users, and the influence of user retweeting. According to their attributes, these features are divided into four categories: user features, microblog features, network structure features and interactive behavior features. According to the type of their values, they are further divided into discrete and continuous features. Because the value ranges of the features differ greatly, min-max normalization is used to map all features to [0,1].

To address the blindness of feature selection in existing research, this paper proposes a relatively complete feature selection model based on Filter and Wrapper methods. Redundant and invalid features not only bring the curse of dimensionality to the model but may also reduce its prediction accuracy. The steps of feature selection are as follows:
(1) Variance analysis: the variance of each feature's values is analyzed. The smaller a feature's variance, the less information it provides and the weaker its ability to separate samples, so such a feature should be deleted.
(2) Relevance analysis: the correlation between each feature and whether the microblog is retweeted is analyzed, using a test that matches the type of the feature values. For discrete features this paper uses the chi-square test, and for continuous features it uses point-biserial correlation analysis. Features whose significance falls below the threshold set in this paper are deleted.
(3) Wrapper feature combination analysis: features may be strongly correlated with one another, which also causes redundancy. To solve this problem, this paper uses the LVW algorithm to analyze feature subsets and obtain the optimal feature combination.

Previous microblog retweeting prediction models mostly used traditional classification models. In this paper, an ensemble learning method is used to build the microblog retweeting prediction model. Experiments show that the method achieves high precision and recall. This paper also analyzes retweeting prediction for microblogs on different topics and finds that prediction works best for microblogs on political and military topics. In addition, it analyzes how different types of microblog users influence retweeting behavior in the microblog network and finds that users with greater influence publish a large number of original posts, whereas ordinary users are more inclined to retweet, and official accounts in particular hardly ever retweet other people's posts.
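
The normalization and filter steps named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas or thresholds, so everything below is an assumption: the function names are hypothetical, the formulas are the textbook definitions of min-max scaling, population variance, the chi-square statistic and the point-biserial coefficient, and the thesis's own implementation may differ.

```python
import math
from collections import Counter
from statistics import mean, pstdev, pvariance


def min_max_normalize(values):
    """Map a list of numbers onto [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def has_low_variance(values, threshold):
    """Filter step 1: flag a feature whose population variance is below a
    caller-chosen threshold (the thesis's threshold is not given)."""
    return pvariance(values) < threshold


def chi_square_statistic(values, labels):
    """Filter step 2, discrete features: Pearson chi-square statistic of the
    contingency table between feature value and retweet label."""
    n = len(values)
    observed = Counter(zip(values, labels))
    row_totals = Counter(values)
    col_totals = Counter(labels)
    stat = 0.0
    for v in row_totals:
        for y in col_totals:
            expected = row_totals[v] * col_totals[y] / n
            stat += (observed[(v, y)] - expected) ** 2 / expected
    return stat


def point_biserial(values, labels):
    """Filter step 2, continuous features: point-biserial correlation between
    a continuous feature and a 0/1 retweet label."""
    ones = [v for v, y in zip(values, labels) if y == 1]
    zeros = [v for v, y in zip(values, labels) if y == 0]
    sd = pstdev(values)
    if not ones or not zeros or sd == 0:
        return 0.0
    p = len(ones) / len(values)
    return (mean(ones) - mean(zeros)) / sd * math.sqrt(p * (1 - p))
```

Turning the chi-square statistic or the correlation into a significance decision additionally requires the corresponding reference distribution and the significance level chosen in the thesis, neither of which the abstract states.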
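
The wrapper step uses LVW (Las Vegas Wrapper), which repeatedly samples random feature subsets and keeps a subset when it lowers the learner's error, or matches the error with fewer features. The sketch below follows that general scheme under stated assumptions: `evaluate` is a hypothetical caller-supplied function returning an error for a subset of feature indices (in practice, typically a cross-validated error of the chosen classifier), the search starts from the full feature set, and `max_stall` and `seed` are illustrative parameters rather than values from the thesis.

```python
import random


def lvw(n_features, evaluate, max_stall=50, seed=0):
    """Las Vegas Wrapper sketch: random subset search with a stall counter.

    evaluate(subset) -> error, where subset is a tuple of feature indices.
    The search stops after max_stall consecutive non-improving samples.
    """
    rng = random.Random(seed)
    best_subset = tuple(range(n_features))  # start from the full feature set
    best_error = evaluate(best_subset)
    stall = 0
    while stall < max_stall:
        # Sample a random subset by including each feature with probability 0.5.
        candidate = tuple(i for i in range(n_features) if rng.random() < 0.5)
        if not candidate:
            stall += 1
            continue
        error = evaluate(candidate)
        better = error < best_error
        same_but_smaller = error == best_error and len(candidate) < len(best_subset)
        if better or same_but_smaller:
            best_subset, best_error = candidate, error
            stall = 0
        else:
            stall += 1
    return best_subset, best_error


def toy_error(subset):
    """Toy evaluator (assumption): features 0 and 2 are the only informative
    ones, and the error counts how many of them are missing."""
    return len({0, 2} - set(subset))


subset, error = lvw(4, toy_error)
```

Because the search is randomized, the returned subset is only guaranteed to be no worse than the full feature set under `evaluate`; larger `max_stall` values trade running time for a better chance of finding the smallest good subset.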
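
Two of the network structure features mentioned in the abstract have standard graph-theoretic definitions, sketched below. This is an illustration only: it assumes an undirected graph stored as a dictionary of neighbor sets, whereas follow relations on a microblog are directed, and the abstract does not state which variant the thesis uses. The graph and the function names are hypothetical.

```python
from itertools import combinations


def neighborhood_overlap(graph, a, b):
    """Shared neighbors of a and b divided by all neighbors of either,
    excluding a and b themselves."""
    union = (graph[a] | graph[b]) - {a, b}
    if not union:
        return 0.0
    return len(graph[a] & graph[b]) / len(union)


def clustering_coefficient(graph, node):
    """Local clustering coefficient: fraction of pairs of the node's
    neighbors that are themselves connected."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Toy undirected graph: a-b, a-c, a-d, b-c.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

overlap_ab = neighborhood_overlap(graph, "a", "b")  # one shared neighbor (c) out of two (c, d)
cc_a = clustering_coefficient(graph, "a")           # one linked pair (b, c) out of three pairs
```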
Keywords/Search Tags: Microblog retweeting, Topic Model, Interest Drift, Feature Selection, Ensemble Learning