Research Of Model And Algorithm Based On Personalized News Recommendation

Posted on: 2016-11-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Yu
GTID: 2308330461956291
Subject: Computer software and theory
Abstract/Summary:
In an era of information overload and increasingly fragmented reading, "personalized" push has become an effective channel for online news services to increase user stickiness. Research on personalized news recommendation is attracting growing attention. Because personalized information recommendation services still leave a wide gap to fill, better mining of users' potential interests and recommending the corresponding news can produce greater social and economic value. The personalized news model and algorithm studied in this thesis aim to predict more accurately which news a user will read, and so improve the user experience.

Many models and algorithms for personalized news recommendation already exist. Traditional content-based news recommendation returns items that are highly similar to news the user has already read, which makes for a poor user experience. Collaborative filtering, in turn, cannot be applied the way it is in e-commerce, because news is continually updated, and collaborative filtering that takes the news category as its unit works with default categories that are relatively coarse. The method of this thesis therefore combines content-based and collaborative filtering approaches and clusters the news data twice.

The method first uses data mining techniques to organize the data. It applies a semantics-based keyword extraction method to extract the keywords of each news article and counts their word frequencies in the document. It then merges the keywords of two news documents and divides them into multiple clusters according to the semantic distance between the keywords. Next, it computes a word-frequency vector over the keyword clusters and uses cosine similarity to calculate the similarity of the two news documents, which serves as the basis for a density clustering.
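The document-similarity step described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the keyword counts, the `keyword_to_cluster` mapping (which stands in for the semantic-distance clustering of keywords), and both function names are hypothetical.

```python
import math
from collections import Counter


def cluster_frequency_vector(keyword_counts, keyword_to_cluster, n_clusters):
    """Sum the keyword frequencies of one document per keyword cluster."""
    vec = [0.0] * n_clusters
    for keyword, count in keyword_counts.items():
        cluster = keyword_to_cluster.get(keyword)
        if cluster is not None:  # keywords without a cluster are ignored
            vec[cluster] += count
    return vec


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical keyword frequencies for two news documents.
doc_a = Counter({"football": 3, "league": 2, "election": 1})
doc_b = Counter({"soccer": 4, "match": 1, "vote": 2})

# Assumed result of clustering the merged keywords by semantic distance:
# cluster 0 collects sport terms, cluster 1 collects politics terms.
keyword_to_cluster = {
    "football": 0, "soccer": 0, "league": 0, "match": 0,
    "election": 1, "vote": 1,
}

vec_a = cluster_frequency_vector(doc_a, keyword_to_cluster, n_clusters=2)
vec_b = cluster_frequency_vector(doc_b, keyword_to_cluster, n_clusters=2)
similarity = cosine_similarity(vec_a, vec_b)
```

The two example documents share no literal keyword, so a plain word-level comparison would score them as unrelated. Comparing frequencies per keyword cluster lets them score as similar.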
Finally, m data points are randomly sampled as the centers of each density cluster, and a fast clustering method is applied to all of the news data. These m small clusters are then merged into a single cluster. Based on this secondary clustering, and on how the news the user has read and the candidate news are distributed across clusters, the thesis builds a user model function for recommendation that takes into account news timeliness, popularity, and other factors.

The work and research completed in this thesis are as follows:

1. The thesis examines the TF-IDF keyword extraction method and finds that it ignores what keywords have in common semantically. It proposes a calculation over the keyword clusters of each news document to determine document similarity.
2. The thesis examines content-based and collaborative filtering recommendation methods and identifies their advantages and shortcomings. The new method combines the two: a secondary clustering method based on content and collaborative filtering. On the content side, it considers the user's browsing history. On the collaborative filtering side, it builds a correlation matrix over the clusters that contain the news users have read in the past and the clusters that contain the candidate news. Finally, matrix decomposition (SVD) is used to predict the user’s interest in the candidate news.
3. Based on the user’s reading history and the cluster distribution of the candidate news, and taking into account news timeliness, popularity, and other factors, a time-marking function is built into the user model to forecast interest, bringing time information into news recommendation.
4. In a personalized news recommendation application, the method of this thesis is compared and analyzed against the content-based news recommendation method and the collaborative filtering recommendation method, and conclusions are drawn.

The personalized news model and algorithm studied in this thesis recommend news better and more effectively.
At the same time, the method does not lead to higher computational overhead. The parameters of each factor are calculated through parameter estimation, and cross-category recommendation is realized effectively, which achieves diversification and extends the approach to the semantic level. The method therefore has considerable application value.
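The SVD-based prediction mentioned in contribution 2 can be illustrated with a truncated SVD. The abstract does not specify how the matrix is laid out, so this sketch assumes a user-by-cluster matrix of read counts. The data, the function name `low_rank_scores`, and the choice of rank 2 are all hypothetical.

```python
import numpy as np


def low_rank_scores(interactions, rank):
    """Rank-k truncated SVD reconstruction of a user-by-cluster matrix."""
    u, s, vt = np.linalg.svd(interactions, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


# Rows are users, columns are news clusters; entries are assumed read counts.
reads = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
    [5.0, 0.0, 0.0, 0.0],  # user 4 has only read news from cluster 0
])

scores = low_rank_scores(reads, rank=2)

# Rank the clusters user 4 has not read yet by predicted score, highest first.
unread = [c for c in range(reads.shape[1]) if reads[4, c] == 0]
ranked_unread = sorted(unread, key=lambda c: scores[4, c], reverse=True)
```

Because clusters 0 and 1 are read together by other users, the low-rank reconstruction gives user 4 a higher score for cluster 1 than for cluster 2. The scores are only meaningful as a relative ranking, not as predicted read counts.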
Keywords/Search Tags: data mining, news keywords, secondary clustering, user model