Algorithm Based On Improved Naive Bayesian For Predicting Weibo Behavior

Posted on: 2020-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: J X Yuan
Full Text: PDF
GTID: 2428330620454837
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, data and information on social platforms are growing explosively, and it is difficult for users to quickly find relevant results in the enormous volume of Web data. The study of Weibo user behavior has therefore become a hot topic. User behavior and content are complex and diverse: most users have few Weibo behaviors; the behavior of most users is consistent; different users express themselves differently; and different users pay attention to different scopes. Much existing research cannot fully capture user information or make full use of the relevant information, so prediction accuracy still needs to be improved. How to improve the accuracy of Weibo behavior prediction and analyze users' content as a whole is therefore an important question.

The Weibo data studied here has the following features. The behavior counts of most Weibo users are zero, while those of some users are not all zero. Behavior counts follow a power-law distribution overall, and each user's behavior counts are approximately clustered. The traditional Naive Bayesian algorithm and Logistic Regression algorithm do not model the relations among feature words and, when computing the result, ignore the features of the content of each behavior. They also do not account for synonymy or users' idiomatic expressions.

This thesis targets Weibo content and three kinds of behavior: forward counts, comment counts, and like counts. By analyzing the overall characteristics of Weibo, we propose an improved Naive Bayesian behavior prediction algorithm. We use jieba to segment words and compute Weibo keywords based on TF*IDF. We combine the LSI algorithm to obtain feature words that are synonymous, and then select the high-frequency feature words. We classify Weibo posts with the LDA algorithm to find the appropriate category set. The hierarchical structure of the constructed object serves as the predictor for the improved Naive Bayesian algorithm and the improved Logistic Regression algorithm. This structure includes the user and the average values of the user's behaviors, common attributes, and key attributes. The Weibo test set is tagged with emotion labels, using easily noticed positive or negative words as the basis for judgment; combined with the predictors, we predict the values of users' behaviors. Experimental results demonstrate that the improved Naive Bayesian algorithm and the improved Logistic Regression algorithm predict better than previous prediction algorithms.
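
To make two steps of the pipeline described in the abstract concrete, the following is a minimal sketch of TF*IDF keyword ranking followed by a plain multinomial Naive Bayesian classifier. It is not the thesis's improved algorithm: the LSI, LDA, hierarchical-predictor, and emotion-label steps are omitted, and the code uses only the Python standard library. The toy posts, the function and class names, and the binning of forward counts into "zero" and "nonzero" classes are all assumptions made for illustration. The posts are assumed to be already segmented into tokens (the thesis uses jieba for this step).

```python
import math
from collections import Counter, defaultdict


def tfidf_keywords(docs, top_k=3):
    """Rank each document's tokens by TF*IDF and return the top_k per document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document
    keywords = []
    for tokens in docs:
        term_freq = Counter(tokens)
        scores = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in term_freq.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords.append(ranked[:top_k])
    return keywords


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, float("-inf")
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + vocab_size
            for word in tokens:
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, pre-segmented toy posts (English tokens for readability).
docs = [
    ["new", "phone", "release", "giveaway"],
    ["giveaway", "forward", "to", "win"],
    ["had", "lunch", "today"],
    ["rainy", "day", "today"],
]
# Assumed binning of each post's forward count into two classes.
labels = ["nonzero", "nonzero", "zero", "zero"]

keywords = tfidf_keywords(docs, top_k=2)
model = NaiveBayes().fit(docs, labels)
print(keywords)
print(model.predict(["phone", "giveaway"]))
print(model.predict(["lunch", "today"]))
```

Treating prediction as classification over binned counts is a simplification chosen here because most users' behavior counts are zero; the thesis itself predicts behavior values and may bin or model them differently.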
Keywords/Search Tags: Weibo behavior prediction, Emotion label, Improved Naive Bayesian algorithm, Improved Logistic Regression algorithm