Research On Automatic Summarization Of Microblog Events

Posted on: 2017-02-28 | Degree: Master | Type: Thesis
Country: China | Candidate: T Cui | Full Text: PDF
GTID: 2308330485453698 | Subject: Computer software and theory
Abstract/Summary:
In the era of Web 2.0, microblogging has become a dominant form of social networking. Many social events spread quickly on microblogging platforms, owing to their high interactivity and large user base. Microblogs are more real-time and immersive than traditional media. Browsing hot news topics on microblogging platforms has become an important way for people to get the latest news.

However, a microblog is a social platform and is not designed or optimized for news exploration. Users can only search for events of interest by keyword, which yields highly redundant search results and incomplete descriptions of events (owing to the length limit on microblog posts). In addition, microblogging platforms usually sort results by release time or popularity, not by content relevance or topic. Thus, users see only part of an event. Meanwhile, for an event (especially one whose focus shifts over time), people want to know not only an overview of the event but also its evolution, causes, and effects.

To solve these problems, we propose a new approach that automatically generates summaries of microblog events. The approach has two parts. First, to address the problem of how to describe a microblog event, we design an automatic summarization algorithm for microblog short texts. This algorithm overcomes the unsuitability of traditional text summarization algorithms for microblog short texts and generates a summary from a global view of an event. Second, to address the problem of summarizing event evolution, we propose an algorithm based on evolution features. We then generate the event evolution summary using the automatic summarization algorithm.
The main contributions of this thesis are as follows:

(1) To solve the problem of how to represent a microblog event, we design an automatic summarization algorithm for microblog short texts. We summarize an event from two aspects: the event-description set and the user-emotion set. To extract the event description, we first partition microblogs into sentences. These sentences are then ranked by their relevance to the event, computed with a graph model, and the most relevant sentences are selected as the event-description set. To extract user emotions about an event, we propose a supervised learning model based on the set of sentences extracted from microblogs. We conduct experiments on a data set of 6 events crawled from Sina Weibo. The experimental results suggest that the proposed approach is effective.

(2) To solve the problem that a summary cannot describe events with a complex evolution process, we propose a hierarchical clustering algorithm based on evolution features to identify the stages of event evolution. Because the widely used cosine similarity of the Vector Space Model ignores the ordering relations between words, we use the Spearman correlation coefficient, which takes word order into account, to calculate text similarity. We then generate a summary for every evolution stage with the automatic summarization algorithm we designed. Together, the stage summaries make up the whole event evolution summary, covering the evolution process, causes, and effects of the event. Experimental results show that our method accurately identifies event evolution stages and that the generated event evolution summary has good accuracy and readability.
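
The Spearman-based text similarity mentioned in contribution (2) can be illustrated with a minimal sketch. The abstract does not say how words are ranked, so this example assumes each word is ranked by its term frequency within a text, over the union vocabulary of the two texts, with tied words sharing an average rank. The function names `average_ranks` and `spearman_similarity` are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_similarity(tokens_a, tokens_b):
    """Spearman rank correlation of word-frequency ranks of two token lists.

    Assumption (not from the thesis): words are ranked by term frequency
    over the union vocabulary; a word absent from a text has frequency 0.
    Returns a value in [-1, 1], or 0.0 when the correlation is undefined.
    """
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    vocab = sorted(set(freq_a) | set(freq_b))
    if len(vocab) < 2:
        return 0.0
    ranks_a = average_ranks([freq_a[w] for w in vocab])
    ranks_b = average_ranks([freq_b[w] for w in vocab])
    mean_a = sum(ranks_a) / len(vocab)
    mean_b = sum(ranks_b) / len(vocab)
    # Spearman's coefficient is the Pearson correlation of the rank vectors.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ranks_a, ranks_b))
    var_a = sum((x - mean_a) ** 2 for x in ranks_a)
    var_b = sum((y - mean_b) ** 2 for y in ranks_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # all words tied in one text: correlation undefined
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    post_a = ["quake", "quake", "rescue"]
    post_b = ["quake", "rescue", "rescue"]
    print(spearman_similarity(post_a, post_a))
    print(spearman_similarity(post_a, post_b))
```

Unlike cosine similarity over raw term weights, this measure depends only on the relative ordering of words by weight, so two posts that emphasize the same words in the same order of importance score highly even when their absolute frequencies differ. A pairwise similarity of this kind could then feed a hierarchical clustering step, though the exact evolution features used in the thesis are not given in the abstract.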
Keywords/Search Tags: HITS, automatic summarization, evolution features, clustering algorithm, ROUGE evaluation