Topic Model For Mining Multi-domain Data And Application

Posted on:2017-11-03

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Chen

Full Text:PDF

GTID:2428330590991580

Subject:Information and Communication Engineering

Abstract/Summary:


With the rapid development of the Internet, people enjoy more and more of the convenience it brings. Most strikingly, sharing and spreading information is far easier than ever before. However, this ease of access has its own problem, namely information overload. People shop online, make friends, watch videos, and browse news on the Internet, and in most cases they are interrupted by worthless or even offensive information. When people look for something of interest, such as the goods they need, favorite movies, or like-minded people, these interruptions damage the user experience. A search engine can find valuable items quickly, but people do not always know exactly what they want, and recommender systems were proposed to handle this case. People's needs today are also diverse. For example, someone who likes a movie adapted from a novel is likely to read the novel, and people are interested in news shared by their like-minded friends.

In this thesis, we analyze data from multi-domain datasets and look for meaningful real-world applications. The work focuses on mining and clustering data from multi-domain datasets and explores its application to cross-domain recommender systems. We propose a novel algorithm called OVCLDA that can analyze multi-domain datasets. It discovers latent topics across domains and clusters terms based on these topics. The model computes not only the multinomial distribution of words over topics but also the multinomial distribution of words over corpus IDs. In this model, latent features can be shared across domains, while each domain can also hold its own specific features. Online learning is also introduced into the algorithm so that it can handle streaming data. The algorithm is much more efficient than Gibbs sampling, which is based on Markov Chain Monte Carlo. In short, the algorithm is well suited to the big data era.

Topic models have already demonstrated their ability to analyze textual data, and it can be inferred that the variants of LDA are competent at the same task. In recommendation problems, however, rating data is commonly used to train a model that then predicts the missing ratings. One widely used approach in recommender systems is the latent factor model (LFM). LFM is essentially similar to the topic model, because both discover the latent features and relations beneath the surface and both use lower-dimensional features to represent high-dimensional data. We improve the existing matrix factorization algorithm and propose DSSCM to deal with the cross-domain recommendation problem. Because it can use data from several domains, it helps alleviate the data sparsity of the traditional recommendation problem. Finally, we conduct extensive experiments, and the results meet our expectations. Thus the topic model and its related model, LFM, can be used to analyze discrete data from multiple domains, and these approaches are very promising for future work.
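
As a rough illustration of the latent factor idea described above, the following is a minimal sketch of plain matrix factorization trained with stochastic gradient descent. It is not the DSSCM algorithm from the thesis, whose details are not given here. The function names (train_mf, predict, rmse), the hyperparameters, and the toy ratings are all assumptions made for the example. Items 0-2 stand for movies and items 3-5 for books, and both domains share the same user factors, which is one simple way that data from a second domain can help with sparsity.

```python
import random


def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
             epochs=1000, seed=0):
    """Factorize sparse (user, item, rating) triples into latent factors.

    Hypothetical helper for illustration only; returns user factors P
    and item factors Q, each a list of k-dimensional vectors.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Prediction error for this observed rating.
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step with L2 regularization on both factors.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(P, Q, ratings):
    """Root mean squared error over the observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


# Toy data (made up): items 0-2 are movies, items 3-5 are books.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 3, 5), (0, 5, 1),
    (1, 0, 4), (1, 2, 1), (1, 3, 5), (1, 4, 4),
    (2, 1, 1), (2, 2, 5), (2, 5, 5),
]

P, Q = train_mf(ratings, n_users=3, n_items=6)
print("training RMSE:", rmse(P, Q, ratings))
# User 2 has rated no item in common with item 4; the shared user
# factors still yield a prediction for that missing rating.
print("predicted rating for user 2 on book 4:", predict(P, Q, 2, 4))
```

Sharing one set of user factors across both item domains is the simplest possible coupling. A cross-domain model would typically also keep domain-specific factors, in line with the shared and domain-specific features mentioned in the abstract.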

Keywords/Search Tags:

Multi-domain data, Topic Model, Latent Factor Model, Clustering, Recommendation Problem


Related items

1	Research On Collaborative Filtering Recommendation Based On User Clustering And Latent Factor Model
2	Research On Personalized Recommendation System Based On Latent Factor Model
3	Research On Recommender Algorithms Based On Latent Factor Model
4	Recommendation System Research Based On Latent Factor Model
5	Research On Personalized Recommendation Algorithms Based On Latent Factor Model
6	The Research Of Personalized Recommendation Algorithm Based On Latent Factor Model
7	Research And Application Of Recommendation Algorithm Based On Latent Factor Model
8	A Personalized Weibo Message Recommendation Algorithm Based On Topical Social Influence
9	Research On Multi-View Topic Model And Application
10	Collaborative Topic Regression Recommendation Algorithm With LF-LDA And Social Relationships