News Hot Topic Discovery And Trend Analysis Research

Posted on: 2020-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Gao
Full Text: PDF
GTID: 2438330626953282
Subject: Service science and software architecture
Abstract/Summary:
With the rapid development of the Internet, online news has gradually become the main source of information for internet users. Online public opinion has become a force that cannot be ignored, and it needs monitoring and guidance. Research on hot topic discovery and trend analysis makes it possible to detect hot social topics in time and to analyze how they develop, which helps in understanding the direction of public opinion, guiding it appropriately, and maintaining social stability. This thesis studies hot topic discovery and trend analysis for news, and mainly covers the following aspects:

1. An improved DPC (Density Peak Clustering) topic clustering method is proposed to address the loss of text semantics and the low clustering accuracy in news text clustering. First, the method vectorizes the news text: after text preprocessing, Word2Vec is used to compute word vectors. Core words are then extracted according to factors such as the frequency of words in the news headline and body, and the word vectors of the core words are used to represent a news text. Next, based on the idea of weighted K-nearest neighbors, an improved density peak algorithm is proposed, with improvements to local density calculation, automatic selection of initial cluster centers, outlier recognition, and the sample allocation strategy. Finally, experiments are carried out on eight benchmark datasets and on Sohu news datasets. The experimental results show that the proposed algorithm effectively improves the accuracy of news topic discovery.

2. To address the difficulty of hot topic discovery, a hot topic discovery algorithm based on a compound attention model is proposed. Each topic is measured in terms of media attention and user attention, and a composite attention score built from the two is used to identify hot topics. In addition, a "topic index" is introduced to describe the development curve of a hot topic and to analyze its development trend. To address the low accuracy of lifecycle stage recognition, an algorithm based on DTW (Dynamic Time Warping) is proposed. Experiments are carried out on training and test sets constructed from 50 hot topics and on real news datasets. The experimental results show that the proposed method accurately identifies hot topics, and that the recognition accuracy for hot topics at every stage of their lifecycle exceeds 83%.

3. Based on the above research results, a news hot topic discovery and trend analysis system is designed and implemented. The system consists of four core modules: news data acquisition, news preprocessing, hot topic discovery and trend analysis, and web display. With all four modules implemented, the system can discover hot topics in time and identify their current lifecycle stage.
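The abstract does not say how DTW is applied to lifecycle stage recognition. As a rough illustration only, the sketch below assumes one common setup: the recent "topic index" curve of a topic is compared against a labeled reference curve for each stage, and the stage whose reference has the smallest DTW distance is chosen. The function names, the stage labels, and the template values are all hypothetical and are not taken from the thesis.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Uses absolute difference as the local cost and the standard
    three-way recurrence (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                                 cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


def classify_stage(curve, templates):
    """Return the label of the template closest to `curve` under DTW."""
    return min(templates, key=lambda label: dtw_distance(curve, templates[label]))


# Hypothetical reference curves; the thesis's actual stages and data are not given.
TEMPLATES = {
    "growth": [1, 2, 4, 7, 10],
    "peak": [8, 10, 10, 9, 8],
    "decline": [10, 7, 4, 2, 1],
}

if __name__ == "__main__":
    recent_topic_index = [1, 1, 3, 5, 8, 10]  # made-up observations
    print(classify_stage(recent_topic_index, TEMPLATES))
```

Because DTW allows one sequence to be stretched or compressed in time against the other, curves of different lengths or with shifted peaks can still be matched, which is presumably why it suits topic curves that evolve at different speeds.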
Keywords/Search Tags: density peak clustering, hot topic discovery, trend analysis, lifecycle