
News Topic Detection And Dynamic Evolution Tracking Based On Event-time Relation Model

Posted on: 2017-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: F L Hu
GTID: 2308330503953789
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, the ways in which information is disseminated and exchanged have changed completely. The Internet, growing explosively, has become the primary means of accessing information. However, readers are faced with a mass of unordered information, so identifying and organizing news topics accurately and intelligently has become a crucial problem in network information processing. Topic Detection and Tracking (TDT) is the research area that addresses this problem. It aims to automatically detect new topics and track known topics in the temporal stream of news media, and to present them to readers in organized form.

This thesis first proposes a new event detection method based on the Event-Time Relation Model (ETRM) for the topic detection task. Specifically, the ETRM is built on the link between events and their time attributes in topics and stories. It describes a topic as a set of events, each corresponding to a different time, obtained by segmenting and extracting the events in the topic according to their time attributes. New Event Detection (NED), an important part of topic detection research, aims to mine the seminal events of news topics and to provide initial centroids for identifying subsequent on-topic stories. This thesis therefore explores the NED task on the basis of the ETRM and proposes a series of improved topic detection strategies. These include using time information to build an event index for the topic model, and determining new events according to the principle that "what happens at the same time is one thing" together with the time-frequency attribute.
In addition, the strategies adjust the relevance between a story and a topic according to whether the event is a seminal event, in order to improve accuracy. The evaluation results show that the ETRM-based method improves both the accuracy and the efficiency of the topic detection system.

Topic tracking is also an important task in TDT research. Its main goal is to identify and mine subsequent on-topic stories in the temporal story stream. Because news changes dynamically over time, this thesis proposes an improved adaptive tracking algorithm based on the temporal relativity of feature items. First, a news topic is represented as a vector space model that includes time information as a feature. Temporal relativity is then computed from the time interval between feature items in different documents and applied as an adjustment factor to the traditional cosine similarity formula to determine the relevance between stories. Additionally, to reflect the focus of a topic in a timely and accurate manner and to counter the topic drift that occurs in traditional adaptive topic tracking, the corresponding weights are adjusted whenever the feature items of the topic model are updated through self-learning. Finally, comparative experiments show that the method described above effectively improves the performance of the system.
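The abstract does not give the adjustment formula, so the following is only a minimal sketch of how a time interval between feature items could act as an adjustment factor inside cosine similarity. The exponential decay, the `half_life_days` parameter, and the `(weight, day)` representation of a feature item are all assumptions made for illustration, not the method used in the thesis.

```python
import math


def temporal_factor(days_apart, half_life_days=7.0):
    """Hypothetical adjustment factor: 1.0 for the same day,
    halving every `half_life_days` of separation (an assumption)."""
    return 0.5 ** (abs(days_apart) / half_life_days)


def time_adjusted_cosine(story, topic, half_life_days=7.0):
    """Cosine similarity in which each shared feature item is scaled
    by a temporal factor.

    `story` and `topic` map a term to a (weight, day) pair, where
    `day` is an integer day index attached to that feature item.
    """
    dot = 0.0
    for term, (w_story, day_story) in story.items():
        if term in topic:
            w_topic, day_topic = topic[term]
            factor = temporal_factor(day_story - day_topic, half_life_days)
            dot += w_story * w_topic * factor

    norm_story = math.sqrt(sum(w * w for w, _ in story.values()))
    norm_topic = math.sqrt(sum(w * w for w, _ in topic.values()))
    if norm_story == 0.0 or norm_topic == 0.0:
        return 0.0
    return dot / (norm_story * norm_topic)
```

With this sketch, two stories with identical terms and weights score close to 1.0 when their feature items carry the same day, and the score drops as the interval between them grows, which is the qualitative behavior the abstract describes.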
Keywords/Search Tags: topic detection and tracking, event-time relation model, new event detection, time relativity, adaptive topic tracking