
Research on Generating Tags for Microblog Users Based on Content

Posted on: 2015-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: X D Sun
Full Text: PDF
GTID: 2348330509460672
Subject: Software engineering
Abstract/Summary:
In recent years, microblog applications such as Twitter and Sina Weibo have developed rapidly. They have not only attracted a large number of users but also accumulated vast amounts of user data. As an emerging medium, microblogging has gradually gained the attention of the general public. Mining users' characteristics from their data therefore has considerable commercial and social value: it supports personalized services for microblog users and recommendations for enterprises and government departments. In this thesis, we identify the characteristics of microblog users and generate tags for them, taking the content they publish as input and applying microblog keyword extraction, microblog user modeling, and classification techniques.

The thesis studies keyword extraction, user modeling, and classification. The main work is as follows.

First, for microblog keyword extraction, we take the characteristics of short microblog text into account and use TF-IDF and TextRank to calculate the weight of each word in a microblog. We then convert each user's microblogs into a vector space model and use a clustering algorithm to extract candidate keywords. Next, we expand the candidate keywords with an n-gram model. Finally, we select the keywords according to accessor variety and the number of semantic units. This procedure extracts the keywords of the content efficiently.

Second, for user modeling, microblogs are divided into four types: original microblogs, reposted microblogs, topic microblogs, and theme microblogs. Taking users who have already added tags to themselves as a reference, we found that the different types of microblog differ in how well they reflect user characteristics. The results show that topic and theme microblogs exhibit the characteristics of users better. On this basis, we propose a user modeling method based on microblog type, which strengthens the weight of topic and theme microblogs in the model. The modeling results are effectively improved.

Finally, for generating user tags, we use the microblogs of officially verified accounts on the Sina platform as training data. Using the keyword extraction method and user modeling scheme of this thesis, we build tag models for the officially verified accounts and user models for the users who are to be tagged. A support vector machine (SVM) classifier then generates tags for those users.
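The type-weighted user modeling idea described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses plain TF-IDF only (no TextRank, clustering, n-gram expansion, or SVM), it assumes each microblog has already been segmented into tokens, and the names `TYPE_WEIGHTS`, `tf_idf`, `user_vector`, and `top_keywords`, as well as the numeric weights, are hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical per-type weights (an assumption, not values from the thesis):
# topic and theme microblogs count more than original or reposted ones.
TYPE_WEIGHTS = {"original": 1.0, "repost": 1.0, "topic": 2.0, "theme": 2.0}


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the plain formulation tf = count / len(doc), idf = log(N / df),
    so a term that occurs in every document gets weight 0.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def user_vector(posts):
    """Build a user's term vector from (post_type, tokens) pairs.

    Each post's TF-IDF weights are scaled by the weight of its type and
    summed per term; unknown types fall back to a weight of 1.0.
    """
    per_post = tf_idf([tokens for _, tokens in posts])
    vec = Counter()
    for (post_type, _), weights in zip(posts, per_post):
        scale = TYPE_WEIGHTS.get(post_type, 1.0)
        for term, value in weights.items():
            vec[term] += scale * value
    return dict(vec)


def top_keywords(vec, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy, made-up posts; real input would be segmented microblog text.
    posts = [
        ("original", ["coffee", "morning", "work"]),
        ("topic", ["football", "league", "match"]),
        ("theme", ["football", "goal", "match"]),
    ]
    print(top_keywords(user_vector(posts)))
```

In this toy input, terms from the topic and theme posts outrank terms from the original post because their weights are doubled before summing. A vector of this kind could then serve as the feature input to a classifier such as an SVM.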
Keywords/Search Tags: Microblog, User Tag, TF-IDF, TextRank, Clustering Algorithm, Support Vector Machine