Recommendation Technique Research On Personalized Microblog Stream

Posted on: 2015-01-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Chen | Full Text: PDF
GTID: 2298330434960851 | Subject: Computer application technology
Abstract/Summary:
Millions of people turn to microblogging services to gather real-time news and opinions about people, things, or events of interest, because these services are convenient and fast. As microblogging has grown in popularity in recent years, microblogging services have come to play a key role in social networks by supporting information sharing. However, more and more users now face the challenge of information overload. How to recommend interesting and important tweets to a user is therefore the key issue.

The main challenge of personalized tweet stream recommendation is the contradiction between the small size of microblog messages and the social features of microblogs; how to detect personal interest accurately is therefore the key problem addressed in this thesis. Most current research on personalized tweet stream recommendation performs poorly because of the limitations of the models used and the challenges posed by microblog data itself. To address this problem, this thesis integrates TF-IDF (term frequency–inverse document frequency) based similarity between streams of tweets with similarity scores between streams of tweets based on LDA (Latent Dirichlet Allocation) models. The research contents are as follows:

Firstly, an approach based on TF-IDF is proposed. In order to capture personal interests, the approach improves the TF-IDF model by combining the weights of single terms and pairs of terms, and evaluates the similarity between the set of a user's tweets and the stream of tweets arriving for that user, based on the idea of collaborative filtering. Moreover, the model considers the cold-start problem and users' personal features to optimize the queue of tweets that users receive.

Secondly, this thesis borrows the machinery of latent variable topic models such as the popular unsupervised model LDA, which has been applied widely to problems in text modeling. These models distill collections of text documents (tweets) into distributions of words that tend to co-occur in similar documents. The approach therefore uses topics to compute the similarity scores between streams of tweets.

Finally, this thesis combines the improved TF-IDF-based model and the LDA model for personalized tweet stream recommendation, and uses precision, correction and MAP (Mean Average Precision) to measure the effectiveness of the algorithm. The experiments on the SINA BLOBS data show that the proposed method can effectively lower the ranking of irrelevant tweets and achieve better performance than several baseline methods based on cosine similarity and hashtags. Microblogs are in essence social networks, and even the content of posts alone has many features. The next step of this research will incorporate the complex features of microblogs into the models.
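
To make the TF-IDF component and the MAP metric described above more concrete, the following is a minimal sketch, not the thesis's actual model. It assumes whitespace tokenization, a smoothed IDF, and equal weighting of single terms and adjacent term pairs. All function names are hypothetical, and the LDA component is omitted. The user's own tweets are pooled into one profile, incoming tweets are ranked by cosine similarity to that profile, and a ranking is scored with average precision.

```python
import math
from collections import Counter


def features(text):
    """Single terms plus adjacent term pairs (assumed whitespace tokenization)."""
    words = text.lower().split()
    pairs = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + pairs


def tfidf_vectors(counts):
    """Turn a list of term-count Counters into sparse TF-IDF dicts."""
    n = len(counts)
    df = Counter(term for c in counts for term in c)  # document frequency
    # Smoothed IDF (an assumption) keeps every weight strictly positive.
    return [
        {t: tf * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, tf in c.items()}
        for c in counts
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_stream(user_tweets, incoming):
    """Return indices of incoming tweets, most similar to the user profile first."""
    profile = Counter()
    for tweet in user_tweets:  # pool counts per tweet so pairs never span tweets
        profile.update(features(tweet))
    vectors = tfidf_vectors([profile] + [Counter(features(t)) for t in incoming])
    scores = [cosine(vectors[0], v) for v in vectors[1:]]
    # sorted() is stable, so tied tweets keep their arrival order.
    return sorted(range(len(incoming)), key=lambda i: scores[i], reverse=True)


def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    hits, total = 0, 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0
```

MAP is then the mean of `average_precision` over all users in the evaluation set. Chinese microblog text would need a word segmenter in place of `split()`, and the thesis's improved weighting of terms and term pairs may differ from the equal weighting assumed here.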
Keywords/Search Tags: Tweet Recommendation, Information Retrieval, Collaborative Filtering, Personalization, Topic Model, Cold-start