Research On Technology Of Hot Event Detection And Tracking In Internet News Streams

Posted on: 2008-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178360212495307
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, web news has become an important channel for publishing information, and people have become used to reading news online. However, users are often lost in the abundance of information, and news from multiple sources presents differing opinions. The category-based classification common on news sites does not help users follow a whole event conveniently, so a tool for detecting hot events is badly needed to help users find news of interest.

In this thesis, we propose an integrated system that lets Internet users browse news from multiple news sources. To provide a more intuitive way to search the news, we use the "event" concept as a news grouping method: different reports of the same event are placed in the same cluster and displayed in the same category for comparison.

First, related work on topic detection and tracking (TDT) and the traditional methods are introduced. We discuss the key problems and improve the indexing structure and text abstracting.

Second, a new method that combines new event detection with KNN is used to detect and track events in news streams. "Event seeds" are introduced to speed up clustering and overcome the topic shift problem. A method for scoring an event's hot degree is presented to improve sensitivity to new hot events.

Finally, a hot event detection and tracking system, HEAT, is designed to verify the algorithms described above. The results show that the system can effectively help users understand the causes and effects of events and acquire complete information from multiple online news sources.
Keywords/Search Tags: Topic Detection and Tracking, New Event Detection, Event Tracking, Text Abstracting, Indexing Structure
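
The abstract describes combining new event detection with KNN to detect and track events in a news stream, but gives no algorithmic detail. The following is only a minimal sketch of that general idea, not the thesis's method: the function names, the bag-of-words representation, the cosine similarity measure, and the values of k and threshold are all assumptions made for illustration. Event seeds and hot-degree scoring are not modeled here.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def detect_and_track(stories, k=3, threshold=0.3):
    """Assign an event id to each story in arrival order.

    A story whose k nearest earlier stories all fall below `threshold`
    starts a new event (new event detection); otherwise it joins the
    event most common among its sufficiently similar neighbours
    (KNN-style tracking). k and threshold are illustrative values.
    """
    seen = []        # (vector, event_id) for every story processed so far
    labels = []
    next_event = 0
    for text in stories:
        vec = vectorize(text)
        # k most similar earlier stories, most similar first
        neighbours = sorted(
            ((cosine(vec, v), e) for v, e in seen), reverse=True
        )[:k]
        close = [e for score, e in neighbours if score >= threshold]
        if close:
            event = Counter(close).most_common(1)[0][0]
        else:
            event = next_event
            next_event += 1
        seen.append((vec, event))
        labels.append(event)
    return labels


if __name__ == "__main__":
    stream = [
        "earthquake hits city rescue teams",
        "rescue teams search city after earthquake",
        "football final score tonight",
        "football final ends with late score",
    ]
    print(detect_and_track(stream))
```

In this sketch every incoming story is compared against all earlier stories, which grows costly on a long stream; that cost is the kind of problem a faster clustering step or an indexing structure would address.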