User Preference Discovery Based On Latent Variable Model In Rating Data

Posted on: 2020-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z Lei
Full Text: PDF
GTID: 2428330575489321
Subject: Computer software and theory
Abstract/Summary:
With the continuous development of the mobile Internet, users generate large volumes of rating data online, such as evaluations of goods on e-commerce platforms, which reflect how satisfied users are with those goods (i.e., their preferences). Fully mining the useful information in this user-generated evaluation data is of great significance for providing users with personalized products and services. User evaluation data includes rating data and comment data. Rating data reflects a user's overall preference, while comment data expresses the user's concerns and preferences regarding different aspects of a product. In recent years, many studies have used evaluation data to model user preferences, but these methods often neglect the intrinsic relationships among different comments within the rating data, and they have difficulty describing the uncertain dependencies between the score and the related attributes.

In this thesis, users' comments on different aspects of the evaluated object are described as different comment attributes, and a word-vector tool is used to mine the semantic information that users express about each comment attribute. Because a latent variable model can represent implicit knowledge through latent variables, this thesis uses a latent variable to represent user preference and constructs a user preference model that can describe arbitrary forms of uncertain dependence among the attributes in user evaluation data. The main contributions of this thesis can be summarized in the following three points:

1) For the preprocessing of comment text, this thesis first uses the Word2vec word-vector tool to transform the comment text into word vectors, then clusters the comment vectors with the k-means algorithm, and determines the comment attributes contained in the comment data from the clustering results. The word vectors of the comment text are classified by measuring distances in the word-vector space, and the comment data is transformed into categorical numeric data according to the degree of satisfaction with each comment attribute. This numeric data is then used to construct the model and to perform inference on it.

2) To ensure the validity of the model construction, this thesis determines the initial structural constraints from the scoring and the comment perspectives respectively, and gives the initial parameter constraints of the user preference model based on the actual meaning of the variables. On this basis, the comment data in the original data set is replaced by the corresponding numeric data, and the sample data set is reconstructed. The EM algorithm is used for parameter learning and the SEM algorithm for structure learning, yielding a constraint-based user preference model.

3) Based on the user preference model, this thesis proposes an inference algorithm based on variable elimination, which estimates user preference from the user's evaluation data. The thesis also gives preference estimation methods for four cases, distinguished by whether or not the user has left a comment and whether or not the user has given a score.
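The inference step in point 3) can be illustrated with a minimal sketch of variable elimination on a toy network. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the structure (a latent preference U with three observed children: a rating R and two comment attributes Q and P), the variable cardinalities, and all probability values. The thesis learns its structure and parameters with SEM and EM, whereas this sketch simply hard-codes them.

```python
from itertools import product

# Hypothetical cardinalities: U = latent preference (0 low, 1 high),
# R = rating level (0..2), Q and P = satisfaction with two comment attributes.
CARD = {"U": 2, "R": 3, "Q": 2, "P": 2}

def make_factor(variables, values):
    """Build a factor as (variables, {assignment tuple: value})."""
    keys = product(*(range(CARD[v]) for v in variables))
    return (tuple(variables), dict(zip(keys, values)))

def restrict(factor, evidence):
    """Keep only rows consistent with the evidence and drop observed variables."""
    variables, table = factor
    keep = [i for i, v in enumerate(variables) if v not in evidence]
    new_table = {}
    for key, val in table.items():
        if all(key[i] == evidence[v] for i, v in enumerate(variables) if v in evidence):
            new_table[tuple(key[i] for i in keep)] = val
    return (tuple(variables[i] for i in keep), new_table)

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    variables = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for key in product(*(range(CARD[v]) for v in variables)):
        a = dict(zip(variables, key))
        table[key] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return (variables, table)

def sum_out(factor, var):
    """Marginalize one variable out of a factor."""
    variables, table = factor
    idx = variables.index(var)
    new_table = {}
    for key, val in table.items():
        k = key[:idx] + key[idx + 1:]
        new_table[k] = new_table.get(k, 0.0) + val
    return (variables[:idx] + variables[idx + 1:], new_table)

def posterior(factors, query, evidence):
    """Return P(query | evidence) as a list, using variable elimination."""
    factors = [restrict(f, evidence) for f in factors]
    hidden = {v for vs, _ in factors for v in vs} - {query}
    for var in sorted(hidden):  # eliminate each unobserved non-query variable
        related = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        prod = related[0]
        for f in related[1:]:
            prod = multiply(prod, f)
        factors.append(sum_out(prod, var))
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    z = sum(result[1].values())
    return [result[1][(x,)] / z for x in range(CARD[query])]

# Hypothetical conditional probability tables (rows ordered by U, then child).
FACTORS = [
    make_factor(["U"], [0.4, 0.6]),                             # P(U)
    make_factor(["U", "R"], [0.6, 0.3, 0.1, 0.1, 0.3, 0.6]),    # P(R | U)
    make_factor(["U", "Q"], [0.7, 0.3, 0.2, 0.8]),              # P(Q | U)
    make_factor(["U", "P"], [0.6, 0.4, 0.3, 0.7]),              # P(P | U)
]

# The four evidence cases: rating and comment, rating only, comment only, neither.
CASES = {
    "rating and comment": {"R": 2, "Q": 1, "P": 1},
    "rating only": {"R": 2},
    "comment only": {"Q": 1, "P": 1},
    "no evidence": {},
}
for name, evidence in CASES.items():
    print(name, [round(p, 3) for p in posterior(FACTORS, "U", evidence)])
```

With these made-up tables, a top rating alone already pushes the probability of high preference to about 0.9, and adding positive comment attributes pushes it higher. With no evidence, the estimate falls back to the prior P(U). In this star-shaped toy structure, eliminating an unobserved child contributes a factor of ones, so the four cases differ only in which children are restricted by evidence.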
Keywords/Search Tags: User rating data, User preference, Bayesian network, Latent variable, Word2vec