
Classifying Interests Of Users Based On Information Content And User Relationship In Sina Weibo

Posted on: 2018-06-23
Degree: Master
Type: Thesis
Country: China
Candidate: N Cui
Full Text: PDF
GTID: 2348330536968492
Subject: Computer application technology
Abstract/Summary:
With the development of Internet technology and mobile terminals, social tools represented by Weibo generate a huge amount of information, and it is difficult to mine users' interests from that information effectively. Based on the attributes of Weibo content and using topic extraction as the main method, this thesis builds a content topic model (CTM). At the same time, considering the characteristics of information dissemination in social networks, it establishes a topic model based on the user's follow relationships (FTM) and combines it with the content topic model, so that users' interests are identified more accurately and effectively. The resulting user interest classification model is applied to the problem of user interest classification, with Sina Weibo as the research object. The main research work of this thesis is as follows.

Firstly, for the content-based CTM classification model, the LDA model is used to build topic models of the training data sets for the various categories, and the Gibbs sampling method is used to estimate the relevant parameters of the LDA model. The thesis represents each microblog text by its probability distribution over the latent topic set and obtains the latent topic-text matrix of the text collection, which addresses the problem of the large volume of Weibo text data. This representation simplifies the text data, reduces its dimensionality substantially, and improves efficiency. The LIBSVM classification algorithm is then applied, improving classification accuracy by combining LDA, which is good at extracting semantic information, with LIBSVM, which has good classification ability.

Secondly, the thesis proposes the FTM model based on the user's follow relationships: it calculates the average number of the user's followers and of followers in each category, constructs the corresponding matrix, and classifies users' interests with LIBSVM based on the follow relationship.

Thirdly, building on the CTM and FTM models, a comprehensive model called ZH is developed to classify users' interests. The characteristics of user interest classification are then analyzed from the overall numbers, from three items of basic user information (gender, academic background, and geographical location), and from follow behavior.

Finally, the thesis summarizes the contribution of the work, considers its existing deficiencies, and outlines future research directions.
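The abstract describes two kinds of feature vectors that are passed to an SVM classifier: per-text topic distributions estimated by LDA with Gibbs sampling, and per-user statistics over categories of follow relationships. The following is a minimal pure-Python sketch of both ideas, not the thesis's implementation. The function names, the hyperparameters `alpha` and `beta`, the toy data, and the reading of the follow-based feature as "share of followed accounts per category" are all assumptions made for illustration. How the ZH model combines the two feature sets is not stated in the abstract, so it is not shown.

```python
import random

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns the document-topic matrix."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # tokens of doc d in topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # tokens of word v in topic k
    topic_total = [0] * num_topics                      # all tokens in topic k
    assignments = []
    # Give every token a random initial topic and record it in the counts.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tv + beta) / (n_t + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed estimate of each document's topic distribution.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return theta

def followee_category_features(followee_categories, categories):
    """Assumed follow-based feature: share of followed accounts per category."""
    total = len(followee_categories)
    if total == 0:
        return [0.0] * len(categories)
    return [followee_categories.count(c) / total for c in categories]

# Toy, already-tokenized posts (hypothetical data).
docs = [
    ["match", "goal", "team", "goal"],
    ["team", "coach", "match"],
    ["stock", "market", "fund"],
    ["fund", "market", "bank", "stock"],
]
theta = lda_gibbs(docs, num_topics=2)
follow_vec = followee_category_features(
    ["sports", "sports", "finance"], ["sports", "finance", "tech"]
)
```

Each row of `theta` has only `num_topics` entries regardless of vocabulary size, which is the dimensionality reduction the abstract refers to. In a full pipeline these rows, and vectors like `follow_vec`, would be written out as training features for an SVM tool such as LIBSVM.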
Keywords/Search Tags:user interest, text, followers, LibSVM, LDA