User Modeling Based On The Latent Topic For Personalized Search Engine

Posted on: 2017-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: D X Zhu
Full Text: PDF
GTID: 2348330488470963
Subject: Software engineering
Abstract/Summary:
As a statistical model, a topic model can effectively uncover the related topics hidden in documents, mapping high-dimensional word vectors into a low-dimensional topic space and thereby reducing the dimensionality of the data. The reduced data can then be turned into information that is more interpretable and easier for users to understand. Because queries are short and noisy, the baseline LDA topic model fails to recognize the importance of query words, and the sparsity of the text data urgently needs to be addressed. This thesis therefore treats the query log as long-term user data, builds a latent topic model of the user, and uses it to realize personalized search. First, the query log is taken as the user's long-term data, and a user document is built from the clicked documents and query words recorded in the log. Second, a new topic model, SELDA, is proposed; it models at both the user level and the user-activity level in order to obtain stronger correlation among the words within each topic. SELDA is an unsupervised, self-learning model that automatically uncovers the latent topics in a corpus, and it is a fully Bayesian model. Finally, the SELDA topic model is trained and evaluated with the TMT topic modeling tools. The results show that, compared with the baseline LDA model, the personalized SELDA topic model provides better results by introducing an updated ranking algorithm for the parameter ?, together with parameter setting and sampling.
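The user-document construction step described in the abstract (combining each user's query words with the documents they clicked) might look roughly like the sketch below. It is only an illustration under stated assumptions: the record layout `(user_id, query, clicked_text)`, the function name `build_user_documents`, and the whitespace tokenization are all hypothetical and are not taken from the thesis, and the SELDA model itself is not implemented here.

```python
from collections import defaultdict


def build_user_documents(log_records):
    """Group query-log records into one pseudo-document per user.

    Assumption: each record is a (user_id, query, clicked_text) tuple.
    This layout is hypothetical; the thesis does not specify its log format.
    """
    docs = defaultdict(list)
    for user_id, query, clicked_text in log_records:
        # Naive lowercase/whitespace tokenization, chosen only for brevity.
        docs[user_id].extend(query.lower().split())
        docs[user_id].extend(clicked_text.lower().split())
    return dict(docs)


# Made-up sample log entries, for illustration only.
sample_log = [
    ("u1", "python tutorial", "Learn Python programming basics"),
    ("u2", "jaguar speed", "The jaguar is a fast cat"),
    ("u1", "pandas dataframe", "Python data analysis library"),
]

user_docs = build_user_documents(sample_log)
for user_id, tokens in user_docs.items():
    print(user_id, tokens)
```

Aggregating a user's short queries into one longer pseudo-document is a common way to ease the sparsity problem the abstract mentions; the resulting token lists could then be passed to any topic-modeling toolkit.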
Keywords/Search Tags: Personalized Search, Topic Modeling, Query Logs, User Modeling