Research and Design of an Online Hot News Recommendation System

Posted on: 2016-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: S W Xu
Full Text: PDF
GTID: 2308330479476633
Subject: Safety science and engineering
Abstract/Summary:
With the development of information technology and the Internet, people have moved from an era of information scarcity into an era of information overload. News reading has shifted from traditional media to browsing thousands of news items online. The Internet now produces a large volume of news data every day, and news aggregation sites such as Google News and Baidu News gather news from many other websites. For these sites, recommending news that users are interested in has become the key issue.

This thesis studies the main domestic and foreign Chinese-language news websites. To handle the large volume of news on these sites, we collect the news text and design and implement a hot news recommendation system. The main work and contributions are as follows:

(1) We collect news from the main domestic and foreign Chinese-language news websites, including headlines, links, publication times, content, sources, and news forums. We describe a news data acquisition and preprocessing system built on the Hadoop distributed platform. The system stores the news data in HBase, which serves as the data source for subsequent processing and analysis.

(2) Hot events are reported by many news media, so hot news headlines from different websites share some similarities. Based on this property, we propose a headline-based hot news recommendation algorithm. We first segment headlines into keywords and preprocess the keywords, then classify the news with a naive Bayes model and an SVM. The classification results are the recommended content. Experiments show that the naive Bayes method outperforms the SVM method: with the naive Bayes model, the recommendation accuracy for the top 100 hot news items reaches 92.5%.

(3) We discuss the shortcomings of the headline-based algorithm and propose a hot news recommendation algorithm based on text summarization. We first extract news summaries using TextRank and a complex-network partitioning method. We then use these summaries in place of the headlines in the recommendation algorithm. Finally, we classify the news with the naive Bayes model and the SVM and recommend the classification results. Experiments show that, with the naive Bayes model applied to summaries, the recommendation accuracy for the top 100 hot news items reaches 94%, which indicates that summary-based recommendation is more accurate than headline-based recommendation.

(4) The news recommendation system built on the summary-based algorithm was put into use for government officials in Hangzhou in March 2014, and the feedback has been very positive.
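
The headline-based classification step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the class `HeadlineNaiveBayes`, the helper `rank_hot`, the labels `"hot"` and `"other"`, and the toy English keywords are all hypothetical, and the headlines are assumed to be already segmented into keywords (the thesis segments Chinese text first). The sketch uses a multinomial naive Bayes model with add-one smoothing and ranks candidates by the log-odds of the `"hot"` class.

```python
import math
from collections import Counter, defaultdict


class HeadlineNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over headline keywords."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # number of headlines per class
        self.word_counts = defaultdict(Counter)  # keyword counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total = len(labels)
        return self

    def log_scores(self, tokens):
        """Return log P(class) + sum of log P(word | class) for each class."""
        scores = {}
        vocab_size = len(self.vocab)
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + vocab_size
            score = math.log(n / self.total)
            for w in tokens:
                if w in self.vocab:  # keywords never seen in training are ignored
                    score += math.log((counts[w] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


def rank_hot(model, candidates, top_n=100):
    """Sort candidate headlines by log-odds of 'hot' versus 'other'."""
    def margin(tokens):
        s = model.log_scores(tokens)
        return s["hot"] - s["other"]
    return sorted(candidates, key=margin, reverse=True)[:top_n]


# Toy training data (hypothetical): pre-segmented headline keywords.
train_docs = [
    ["earthquake", "rescue", "update"],
    ["earthquake", "casualties", "report"],
    ["election", "results", "announced"],
    ["local", "bakery", "opens"],
    ["weather", "sunny", "weekend"],
    ["museum", "new", "exhibit"],
]
train_labels = ["hot", "hot", "hot", "other", "other", "other"]

model = HeadlineNaiveBayes().fit(train_docs, train_labels)
print(model.predict(["earthquake", "rescue", "report"]))
print(rank_hot(model, [["bakery", "weekend"], ["earthquake", "rescue", "report"]], top_n=1))
```

The same classifier could be fed keywords taken from extracted summaries instead of headlines, which is the substitution the abstract describes in contribution (3).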
Keywords/Search Tags: Recommendation System, Chinese Segmentation, Text Summarization, Text Classification, Naive Bayes, SVM