Rating Data Analysis And User Preference Modeling Based On Latent Variable Model

Posted on: 2018-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: R S Gao
Full Text: PDF
GTID: 2428330518458875
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Web 2.0 applications such as e-commerce and social media, large and growing volumes of user behavior data are generated on the Internet. How to construct user preference models by analyzing this massive behavior data, and thereby support personalized services, has been studied extensively in data analysis and knowledge discovery. Online rating data, which represents a user's overall evaluation of events or products, contains the user's latent preference. Modeling preference from rating data in order to estimate user preference is therefore of great significance for personalized services such as personalized recommendation and precision marketing.

In recent years, researchers in machine learning and artificial intelligence have proposed various methods for modeling user preference from rating data, but most of these models either cannot objectively describe arbitrary dependencies among attributes in the data or have poor interpretability. A latent variable model describes hidden knowledge through latent variables, which keeps the model simple and easy to interpret. The Bayesian network (BN) with latent variables, one kind of latent variable model, has recently become popular in uncertain artificial intelligence and has been applied extensively in uncertain reasoning. Adopting a latent variable to describe user preference, we construct a user preference model, based on a BN with a latent variable, that represents arbitrary dependencies and uncertainty among the attributes in rating data.

However, constructing a BN with latent variables is difficult and complex, since the construction process involves several iterations, each of which requires NP-hard probabilistic inference. Therefore, to model preference effectively from rating data, we analyze the characteristics of BNs with a latent variable and of rating data, propose a constraint-based method for user preference modeling, and design a corresponding parallel algorithm on Spark, a distributed in-memory computing framework. In addition, to apply the model, we give a method for user preference estimation based on model inference.

The main research contents of this thesis are summarized as follows:
(1) We propose the user preference Bayesian network (UPBN), which adopts a latent variable to describe user preference and represents arbitrary dependencies and uncertainty among the observed attributes and the latent user preference in rating data.
(2) We give the property that ensures a Bayesian network with a latent variable can fully fit the given rating data under the expectation maximization (EM) algorithm.
(3) Based on this property and on domain knowledge about rating data, we give a constraint-based method that constructs the UPBN by applying the structural EM algorithm, and we design a corresponding parallel algorithm on Spark.
(4) We propose a UPBN inference algorithm based on variable elimination and, building on it, give a method for user preference estimation.
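The two core operations named above, EM fitting with a latent preference variable and preference estimation by variable elimination, can be illustrated on a toy network. The following is only a minimal sketch under stated assumptions, not the thesis's UPBN construction: the structure is fixed by hand as a hypothetical chain G -> L -> R (item attribute G, latent preference L, rating R), so only parameters are learned (no structural EM, no constraints, no Spark parallelism). The toy data is synthetic, and all function and variable names are illustrative.

```python
import numpy as np

def em_latent_bn(data, n_g, n_l, n_r, n_iter=50, seed=0, eps=1e-9):
    """Parameter EM for the assumed chain G -> L -> R, where L is never observed.

    data: integer array of shape (N, 2) with columns (g, r).
    Returns P(G), P(L|G), P(R|L) and the log-likelihood at each iteration.
    """
    rng = np.random.default_rng(seed)
    g, r = data[:, 0], data[:, 1]
    p_g = np.bincount(g, minlength=n_g) / len(g)   # G is observed: plain frequencies
    p_l_g = rng.dirichlet(np.ones(n_l), size=n_g)  # random init, rows g, columns l
    p_r_l = rng.dirichlet(np.ones(n_r), size=n_l)  # random init, rows l, columns r
    log_liks = []
    for _ in range(n_iter):
        # E-step: q[n, l] = P(L=l | g_n, r_n), proportional to P(l | g_n) P(r_n | l)
        joint = p_l_g[g] * p_r_l[:, r].T           # shape (N, n_l)
        evidence = joint.sum(axis=1, keepdims=True)
        q = joint / evidence
        log_liks.append(float(np.log(evidence).sum() + np.log(p_g[g]).sum()))
        # M-step: accumulate expected counts, then renormalise each CPT row
        c_gl = np.zeros((n_g, n_l))
        np.add.at(c_gl, g, q)
        c_rl = np.zeros((n_r, n_l))
        np.add.at(c_rl, r, q)
        c_lr = c_rl.T
        p_l_g = (c_gl + eps) / (c_gl + eps).sum(axis=1, keepdims=True)
        p_r_l = (c_lr + eps) / (c_lr + eps).sum(axis=1, keepdims=True)
    return p_g, p_l_g, p_r_l, log_liks

def preference_given_rating(p_g, p_l_g, p_r_l, r):
    """P(L | R=r): sum out G (variable elimination), then condition on the rating."""
    p_l = p_g @ p_l_g               # eliminate G: sum_g P(g) P(L | g)
    unnorm = p_l * p_r_l[:, r]      # multiply in the evidence factor P(R=r | L)
    return unnorm / unnorm.sum()

# Synthetic toy data (hypothetical): 3 attribute values, 2 preference states, 3 rating levels.
rng = np.random.default_rng(1)
true_l_g = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
true_r_l = np.array([[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]])
g_obs = rng.integers(0, 3, size=500)
l_hid = (rng.random(500) < true_l_g[g_obs, 1]).astype(int)
r_obs = np.array([rng.choice(3, p=true_r_l[k]) for k in l_hid])
toy = np.column_stack([g_obs, r_obs])

p_g, p_l_g, p_r_l, log_liks = em_latent_bn(toy, n_g=3, n_l=2, n_r=3)
print(preference_given_rating(p_g, p_l_g, p_r_l, r=2))
```

Because L is latent, its states are identifiable only up to relabeling, so the learned state 0 need not correspond to the "true" state 0 of the generator. In a larger network the elimination step would be repeated once per non-query, non-evidence variable; here there is only one such variable, G.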
Keywords/Search Tags: Rating data, User preference, Latent variable, Bayesian network, Structural EM