| With the rapid development of the mobile Internet, people have become accustomed to browsing online news to obtain information. Major online news platforms report social news around the clock, so readers face heavily duplicated and overlapping content, a wide variety of topics, and difficulty filtering out hot topics. It is hard for readers to find hot topics or topics of interest and to follow how a topic evolves, and hot topics may be buried under the large volume of new articles. How to mine hot topics from large amounts of online news and analyze their trends has therefore become an important problem. This paper focuses on presenting hot topics at different time granularities and analyzing their trends, helping readers read and understand news topics comprehensively. The main work of this paper is as follows. First, a news topic mining algorithm based on composite-model clustering is proposed. The algorithm improves the agglomerative hierarchical clustering algorithm to reduce its running time and then combines it with the K-means algorithm. The improved agglomerative hierarchical clustering algorithm is first applied to the text set, and the number of news topics and the initial cluster centers are determined automatically according to a cluster validity index and an improved max-min distance algorithm. The K-means algorithm then clusters the text set to obtain the final news topics. Experimental results show that the composite-model clustering algorithm outperforms the traditional single clustering algorithms. Second, a method for evaluating topic heat is proposed. The traditional TF-PDF heat evaluation algorithm has the shortcoming of considering only media attention, so user participation, including the number of reads and the number of comments on the news, is introduced to improve the traditional TF-PDF algorithm for evaluating news topic heat. A “topic index” is also introduced, with which topic mining and topic association are performed across different time slices to analyze the trend changes of hot topics. Experimental results show that the improved TF-PDF algorithm evaluates heat more effectively. Third, based on the above research, this paper designs and implements a news hot topic mining system, which mainly comprises news text crawling, text preprocessing, hot topic mining, and hot topic trend analysis modules and presents hot news topics and their trend changes. |
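
As a rough illustration of the heat evaluation idea described above, the following Python sketch computes baseline TF-PDF term weights as the measure is commonly defined: each term's normalized frequency within a channel, scaled by the exponential of the proportion of that channel's documents containing it, summed over channels. It then scales a topic's media attention by a user participation factor. The abstract does not give the improved formula, so `topic_heat`, its logarithmic damping of reads and comments, and all function and parameter names here are assumptions made for illustration only, not the paper's actual method.

```python
import math
from collections import Counter, defaultdict


def tf_pdf(docs_by_channel):
    """Baseline TF-PDF weights: W_j = sum over channels c of |F_jc| * exp(n_jc / N_c).

    docs_by_channel maps a channel (news source) name to a list of
    tokenized documents, each document being a list of terms.
    """
    weights = defaultdict(float)
    for docs in docs_by_channel.values():
        term_freq = Counter()  # F_jc: occurrences of term j in channel c
        doc_freq = Counter()   # n_jc: documents in channel c containing term j
        for tokens in docs:
            term_freq.update(tokens)
            doc_freq.update(set(tokens))
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue  # empty channel contributes nothing
        n_docs = len(docs)  # N_c: total documents in channel c
        for term, freq in term_freq.items():
            weights[term] += (freq / norm) * math.exp(doc_freq[term] / n_docs)
    return dict(weights)


def topic_heat(topic_terms, term_weights, reads, comments):
    """Hypothetical heat score (an assumption, not the paper's formula).

    Media attention is the summed TF-PDF weight of the topic's terms;
    user participation is a log-damped count of reads and comments.
    """
    media_attention = sum(term_weights.get(t, 0.0) for t in topic_terms)
    participation = math.log1p(reads) + math.log1p(comments)
    return media_attention * (1.0 + participation)


if __name__ == "__main__":
    # Toy corpus: two channels, already tokenized.
    corpus = {
        "channel_a": [["flood", "rescue"], ["flood"]],
        "channel_b": [["flood"]],
    }
    w = tf_pdf(corpus)
    print(sorted(w.items(), key=lambda kv: -kv[1]))
    print(topic_heat(["flood", "rescue"], w, reads=1200, comments=85))
```

With zero reads and zero comments the participation term vanishes, so the score reduces to the media attention given by TF-PDF alone. A topic with the same coverage but more reader activity ranks higher.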