
Research On Personalized Information Recommendation Algorithm Based On Short Text Mining

Posted on: 2018-11-14  Degree: Master  Type: Thesis
Country: China  Candidate: P Zhang  Full Text: PDF
GTID: 2348330518489470  Subject: Information management
Abstract/Summary:
With the development of Internet technology, the amount of data is growing at the BT level. Unlike the structured data handled in traditional data mining, short text information is stored in unstructured or structured form. Extracting information from such data for a specific purpose has become a common goal. An effective technique is therefore needed to extract the valuable information or knowledge contained in short text data and to push personalized content according to each user's needs. This saves time spent searching massive data and improves the efficiency and accuracy of information search. Building on a study of the existing literature: (1) this paper summarizes the concepts and processes of current text mining and short text mining technology, sums up the characteristics of short text data, and proposes integrating user comments with the information itself as a feature extension method aimed at the sparsity of short text; (2) it studies and analyzes the topic models used in topic mining, and proposes the UCBTM, which analyzes and mines topics from the user's point of view without the aid of an external corpus. First, each user's short text and its corresponding user comments are integrated into a "preliminary long text", yielding a "preliminary long text" document collection. This addresses feature sparseness to a certain degree. Then the K-means clustering algorithm is used to group "preliminary long texts" with close topics in the collection into one cluster, and all the "preliminary long text" information in each cluster is integrated into one document to further alleviate feature sparseness.
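The feature extension steps described above (merge each short text with its comments, cluster the merged texts with K-means, then merge each cluster into one document) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the toy data, the term-frequency vectors, and the seeding of centroids from the first k texts are all hypothetical choices made for the example.

```python
from collections import Counter
import math


def build_long_texts(posts, comments):
    """Join each short post with its comments (hypothetical data layout)."""
    return [" ".join([post] + comments.get(i, [])) for i, post in enumerate(posts)]


def vectorize(texts):
    """L2-normalized term-frequency vectors over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    vectors = []
    for t in texts:
        v = [0.0] * len(vocab)
        for w, c in Counter(t.lower().split()).items():
            v[index[w]] = float(c)
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def kmeans(vectors, k, iterations=20):
    """Basic Lloyd's K-means; the first k vectors seed the centroids (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def merge_clusters(texts, labels, k):
    """Concatenate all long texts in a cluster into a single document."""
    return [" ".join(t for t, l in zip(texts, labels) if l == c) for c in range(k)]


# Toy data, invented for illustration only.
posts = ["new phone camera", "football match tonight",
         "phone battery life", "football team wins"]
comments = {0: ["camera looks great"], 1: ["great match"],
            2: ["battery drains fast"], 3: ["team played well"]}

long_texts = build_long_texts(posts, comments)
labels = kmeans(vectorize(long_texts), k=2)
merged = merge_clusters(long_texts, labels, k=2)
```

In this toy run the two phone posts end up in one merged document and the two football posts in the other, so each document fed to the topic model carries more word co-occurrence evidence than any single short text would.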
Finally, it models the co-occurrence of words in each document according to the user topics in the document collection, obtaining the topic distribution of the entire document collection and the topic distribution of each user. (3) This paper then studies the PageRank algorithm and user influence algorithms, and proposes an influence model for microblogging users based on the features of Sina Microblogging data. Based on the PageRank algorithm, it proposes UserRank, a user influence algorithm for social networking platforms. (4) This paper then proposes a personalized user recommendation algorithm based on topic mining and UserRank. First, it calculates user topic similarity from the user-topic distribution and topic-word distribution in the UCBTM model. Then UserRank calculates each user's influence. Finally, the recommendation degree value of one user for another is calculated from the two, and users are ranked by this value to obtain the user recommendation list. (5) Finally, this paper carries out an experimental analysis of the UCBTM topic model and the personalized recommendation algorithm using Sina microblogging data, and compares them with other algorithms and models. The results indicate that UCBTM improves the quality and efficiency of topic mining, and that the personalized information recommendation algorithm improves user satisfaction.
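The recommendation pipeline in (4) can be illustrated with a short sketch. The abstract does not give the UserRank formula or the way similarity and influence are combined, so this example substitutes standard PageRank (power iteration) for UserRank, uses cosine similarity between user-topic distributions, and multiplies the two as the recommendation degree. All three choices, as well as the function names and toy data, are assumptions made for illustration.

```python
import math


def pagerank(follows, damping=0.85, iterations=50):
    """Standard PageRank by power iteration (a stand-in for UserRank).

    follows maps each user to the list of users they follow, so rank flows
    from follower to followee. Every followee must also be a key of follows.
    """
    users = sorted(follows)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users who follow nobody spread their rank evenly over all users.
        dangling = sum(rank[u] for u in users if not follows[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in users}
        for u in users:
            for v in follows[u]:
                new_rank[v] += damping * rank[u] / len(follows[u])
        rank = new_rank
    return rank


def cosine(p, q):
    """Cosine similarity between two user-topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def recommend(target, user_topics, influence):
    """Rank other users by similarity * influence (an assumed combination)."""
    scores = {
        u: cosine(user_topics[target], user_topics[u]) * influence[u]
        for u in user_topics
        if u != target
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy follower graph and user-topic distributions, invented for illustration.
follows = {"ann": ["bob"], "bob": ["cat"], "cat": ["bob"], "dan": ["bob", "cat"]}
user_topics = {"ann": [0.8, 0.2], "bob": [0.7, 0.3],
               "cat": [0.1, 0.9], "dan": [0.5, 0.5]}

influence = pagerank(follows)
ranking = recommend("ann", user_topics, influence)
```

Here a user who is both topically close to the target and widely followed rises to the top of the list, while a topically close user with little influence is ranked lower; a weighted sum or another combination would be an equally plausible reading of the abstract.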
Keywords/Search Tags: topic model, personalized recommendation, user influence value, short text