Research On Interest Mining Method Based On Weibo User Attributes And Posted Content

Posted on: 2020-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Yu
Full Text: PDF
GTID: 2438330575996447
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet and the proliferation of social media, and especially the rapid spread of mobile network services, the number of Weibo users is growing quickly. According to the 2018 Weibo User Development Report, Weibo's monthly active users reached 337 million, an increase of 3.8% over 2017. Because users can quickly post short messages and ideas on the platform, this large user base generates nearly 100 million data points a day. These data cover users' life experiences, professional knowledge, hobbies, novel ideas, and more. Extracting valuable information such as user behavior, hobbies, and consumption levels from these data is a meaningful task. For example, on the Weibo platform, the greatest commercial value of user interest mining lies in microblog marketing and user recommendation. Interest mining infers each user's interests through the network of relationships between people and predicts the items and the other Weibo users that a user may be interested in, so that hot items can be recommended to each user and users with the same interests can be grouped together. Therefore, whether the platform is marketing hot items to users in a differentiated way or grouping users with the same hobbies, accurately mining users' interest preferences is the basic task underlying both.

This thesis analyzes the characteristics of users on the Weibo social network platform. The main features of a Weibo user are the text the user posts, the user's personal attributes, and the user's interaction behavior. A user's interest bias is reflected mainly in the published content: a user who is interested in a field is more inclined to publish, forward, and comment on topics in that field. Accordingly, this thesis divides users by activity level into active users and inactive users. For active users, whose content is rich, the published content can be used directly to mine the interest bias. However, Weibo posts are typical short texts, characterized by sparsity, high dimensionality, and large data volume. For inactive users, who have little or no published content, all available features must be fully exploited. The thesis carries out the following work for the two types of users:

1. For active users, this thesis proposes the ULW-DMM model, which extends the Dirichlet Multinomial Mixture (DMM) topic model by combining a User-LDA topic model built on internal data with latent feature vector representations of words trained on a very large external corpus. The ULW-DMM model extracts informative topics from microblog short texts, and the user's topic information is then aggregated to mine the user's interest points and interest areas.

2. For inactive users, this thesis mines interest points from the user's personal attributes and the small amount of published content. First, a method for quantifying a user's interest points is proposed, which establishes the user's interest value for each interest point. Then, combining a bipartite graph, a random walk algorithm, and a collaborative filtering algorithm, a novel bipartite-graph-based restricted random walk collaborative filtering algorithm is proposed. The algorithm builds a user-interest point bipartite graph from the users' interest values, then uses a restricted random walk to find a set of similar users, and finally applies the core idea of collaborative filtering to mine more interest points for inactive users.

3. The experimental results show that, compared with four other similar models, ULW-DMM achieves better topic coherence when modeling microblog short texts and relatively large improvements on the task of classifying users by field of interest, which verifies the validity of the model for mining the interests of active users. The bipartite-graph-based restricted random walk collaborative filtering algorithm has clear advantages over two other collaborative filtering algorithms. Accuracy experiments show that the proposed algorithm finds higher-quality similar user sets for inactive users. In the partition experiment, the algorithm's F1 value for the interest areas of cold-start users increases by 27.2%, which verifies the effectiveness of the algorithm for mining the interests of inactive users.
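
The pipeline described for inactive users (bipartite graph, restricted random walk, collaborative filtering) can be illustrated with a small sketch. This is not the thesis's algorithm, whose exact restriction rule and interest-value formula are not given in the abstract. It assumes that the "restriction" is a two-step walk (user, interest point, user) that may not end at the starting user, and it computes the exact visit probabilities of that walk instead of sampling. The function names `similar_users` and `recommend_interests`, and the toy interest values, are hypothetical.

```python
from collections import defaultdict


def _normalize(weights):
    """Turn a dict of edge weights into step probabilities that sum to 1."""
    total = sum(weights.values())
    return {node: w / total for node, w in weights.items()}


def similar_users(user_interest, start, top_k=2):
    """Rank other users by the probability that a two-step walk
    (start user -> interest point -> user) lands on them.

    Assumption: the walk is restricted to two steps and may not end
    at the start user; the thesis may use a different restriction.
    """
    # Build the reverse side of the bipartite graph: interest point -> users.
    interest_user = defaultdict(dict)
    for user, interests in user_interest.items():
        for point, value in interests.items():
            interest_user[point][user] = value

    scores = defaultdict(float)
    for point, p_up in _normalize(user_interest[start]).items():
        for other, p_pu in _normalize(interest_user[point]).items():
            if other != start:
                scores[other] += p_up * p_pu

    # Highest visit probability first; ties broken by user name.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


def recommend_interests(user_interest, start, top_k=2, top_n=3):
    """Collaborative filtering step: score interest points the start user
    does not have yet by the similarity-weighted values of similar users."""
    known = user_interest[start]
    predicted = defaultdict(float)
    for other, similarity in similar_users(user_interest, start, top_k):
        for point, value in user_interest[other].items():
            if point not in known:
                predicted[point] += similarity * value
    ranked = sorted(predicted.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Toy user -> {interest point: interest value} graph (made-up numbers).
toy_graph = {
    "alice": {"music": 1.0},                  # inactive user, one known interest
    "bob": {"music": 0.8, "film": 0.6},
    "carol": {"music": 0.2, "sports": 0.9},
    "dave": {"sports": 0.7, "tech": 0.5},
}

if __name__ == "__main__":
    print(similar_users(toy_graph, "alice"))
    print(recommend_interests(toy_graph, "alice"))
```

In this toy graph, "bob" ranks above "carol" as a neighbour of "alice" because he places more weight on the shared interest point, so his other interest ("film") is suggested ahead of hers ("sports"). "dave" shares no interest point with "alice" and is never reached by the two-step walk.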
Keywords/Search Tags: microblogging platform, interest mining, user attributes, posting content