Research On The Algorithm Of Hot Event Detection And Tracking Based On Network News Flow

Posted on: 2019-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Qi
GTID: 2428330548976395
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, the Internet has become an indispensable part of daily life. Faced with the sheer volume of news online, however, people find it hard to quickly locate the news events they are interested in, and they cannot obtain timely follow-up reports on those events. How to quickly discover hot social events from massive volumes of network news and help users find related news has therefore become an active research topic. This thesis takes the online news flow as its research object and, drawing on topic detection and tracking techniques, designs two algorithms to detect hot events and track events in the news flow.

Text clustering algorithms can detect hot events and organize news by clustering a set of texts. However, most text clustering algorithms are static: they recluster every object in the data set, which has high time complexity. Because the news flow handled in this thesis is a dynamic data set, the idea of incremental clustering is used to detect events from it. The work covers the following aspects:

(1) The construction of the news and event models is studied in depth. News items and events contain keywords that are closely related to their themes. These keywords are used as feature terms to construct vector space models of news and events. When a news item is assigned to an event, the weights of the event vector's feature terms are adjusted so that the vector dynamically reflects how the event develops, which makes the model suitable for online hot event detection.

(2) A hot event detection and tracking algorithm based on the weights of feature terms is proposed. The NLPIR word segmentation tool extracts the relevant keywords from each news item, and the news vector model is built with these keywords as feature terms. Each time a news item arrives, it is matched against the events in the event library to obtain the maximum similarity. That similarity is then compared with a given threshold: if it exceeds the threshold, the news item is assigned to the matching existing event and the weights of that event vector's feature terms are adjusted; otherwise, the news item is stored in the event library as a new event. Experiments show that the proposed algorithm detects some hot events more effectively.

(3) A hot event detection and tracking algorithm based on the growth trend of feature terms is proposed. A study of how the weights of feature terms are distributed over time finds that the weights of terms closely related to the subject grow faster over time. The growth trend of a feature term can therefore be used to reflect its heat. Based on this property, a new similarity algorithm is designed. Analysis of the experimental results shows that the proposed algorithm can effectively detect some hot events.
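
The threshold-based matching step described in aspect (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class name EventLibrary, the use of raw keyword counts as feature weights, cosine similarity as the similarity measure, the additive weight update, and the threshold value 0.5 are all assumptions made for illustration. Keywords are passed in directly, standing in for the output of a keyword extractor such as NLPIR.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EventLibrary:
    """Hypothetical single-pass (incremental) event store."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed value, not from the thesis
        self.events = []            # each event: {"vector": Counter, "news": [ids]}

    def add_news(self, news_id, keywords):
        """Assign one news item to an event and return that event's index."""
        # Assumption: a keyword's weight is simply its count in the item.
        news_vec = Counter(keywords)

        # Find the existing event with the maximum similarity.
        best_idx, best_sim = -1, 0.0
        for i, event in enumerate(self.events):
            sim = cosine(news_vec, event["vector"])
            if sim > best_sim:
                best_idx, best_sim = i, sim

        if best_idx >= 0 and best_sim > self.threshold:
            # Above the threshold: join the event and adjust its weights.
            # Assumption: the adjustment just adds the item's weights.
            event = self.events[best_idx]
            event["vector"].update(news_vec)
            event["news"].append(news_id)
            return best_idx

        # Otherwise the item starts a new event.
        self.events.append({"vector": news_vec, "news": [news_id]})
        return len(self.events) - 1


library = EventLibrary(threshold=0.5)
first = library.add_news("n1", ["earthquake", "rescue", "sichuan"])
second = library.add_news("n2", ["earthquake", "rescue", "aftershock"])
third = library.add_news("n3", ["election", "vote", "candidate"])
# n1 and n2 share two of three keywords (similarity 2/3), so they land in
# the same event; n3 shares none and opens a new event.
```

Because each incoming item is compared only with the current event vectors, earlier news never has to be reclustered, which is the property that motivates incremental clustering for a news flow.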
Keywords/Search Tags:Network News, Event Detection, Event Tracking, Attenuation Ratio, Growth Trend