
Temporal Summarization Based on Event Topic Mining

Posted on: 2016-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Yao
Full Text: PDF
GTID: 2308330503450603
Subject: Computer Science and Technology
Abstract/Summary:
With the arrival of the big data era, users are submerged in a sea of information and face the awkward situation of “Big Data, Thin Knowledge”. For example, when an event happens, different information sources, different viewpoints and different stages of the event's development produce a large volume of news reports. This makes it difficult for users to obtain comprehensive, correct and novel news information. Temporal summarization combines topic detection and information mining techniques. It can help people grasp news and trends in a timely manner, so it has attracted the attention of many researchers.

The main work of this paper is temporal summarization of event news. By analyzing the problems of traditional techniques, including insufficient content analysis, high information redundancy and the uniform structure of summaries, this paper proposes two main algorithms: an event analysis technique based on semantic space mapping and a topic clustering method based on data gravity. According to the TREC results, our method substantially improves summarization performance. The main research content is as follows:

Firstly, based on a survey of the automatic summarization field and the characteristics of the task, this paper proposes the system design and the framework for building summaries. It then introduces the four modules: information retrieval, event analysis, topic clustering and summary extraction.

Secondly, unlike traditional techniques that divide text by time, this paper proposes a semantics-based method. The goal is to understand news events from a semantic perspective in order to improve quality and accuracy. The method maps text from a high-dimensional space to a low-dimensional semantic space in order to analyze how the event develops.

To reduce information redundancy, this paper proposes a topic clustering method based on data gravity. Drawing on universal gravitation, the method observes that the attraction between data points behaves similarly to universal gravitation. Based on this property, the clustering adjusts the cluster radius dynamically to improve cluster precision.

Finally, this paper evaluates the system on the Temporal Summarization Track in TREC 2014. The experimental results show that the system improves Expected Gain and Comprehensiveness more than other teams' systems. Notably, in TREC 2013 our team was invited to give a 25-minute presentation in the USA because of its high score in the Value Track. In TREC 2014, our team took second place worldwide.
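
The abstract does not give the formulas behind the data-gravity clustering, so the following is only a minimal sketch of the general idea, not the thesis's algorithm. It assumes every data point has unit mass, that a cluster's mass is its member count, and that attraction follows a Newton-style rule g * m_a * m_b / d^2. The names `gravity`, `gravity_cluster` and `min_force` are hypothetical, and the 2-D points stand in for low-dimensional sentence vectors.

```python
import math


def gravity(mass_a, mass_b, distance, g=1.0):
    """Newton-style attraction g * m_a * m_b / d^2 (assumed scoring rule)."""
    if distance == 0.0:
        return math.inf
    return g * mass_a * mass_b / distance ** 2


def gravity_cluster(points, min_force=1.0):
    """Greedy single-pass clustering (illustrative assumption, not the thesis's method).

    Each point joins the cluster that attracts it most strongly, or starts
    a new cluster when no attraction reaches min_force.
    """
    clusters = []  # each cluster: centroid, mass (member count), member indices
    for idx, point in enumerate(points):
        best, best_force = None, 0.0
        for cluster in clusters:
            distance = math.dist(point, cluster["centroid"])
            force = gravity(1.0, cluster["mass"], distance)
            if force > best_force:
                best, best_force = cluster, force
        if best is not None and best_force >= min_force:
            mass = best["mass"]
            # Move the centroid to the mass-weighted mean including the new point.
            best["centroid"] = [
                (mass * c + p) / (mass + 1.0)
                for c, p in zip(best["centroid"], point)
            ]
            best["mass"] = mass + 1.0
            best["members"].append(idx)
        else:
            clusters.append(
                {"centroid": list(point), "mass": 1.0, "members": [idx]}
            )
    return clusters


if __name__ == "__main__":
    sample = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0), (10.5, 10.0), (0.2, 0.1)]
    for cluster in gravity_cluster(sample):
        print(cluster["members"], cluster["centroid"])
```

Under these assumptions a point is absorbed when its distance to a centroid is at most sqrt(g * mass / min_force), so the effective radius of a cluster grows as it gains members. This is one simple way to obtain the dynamically adjusted radius that the abstract describes.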
Keywords/Search Tags: temporal summarization, data dimension reduction, topic clustering, information extraction