
Research And Implementation On Event Correlation Analysis Method For Microblog Platform

Posted on: 2015-07-02 | Degree: Master | Type: Thesis
Country: China | Candidate: J Liu | Full Text: PDF
GTID: 2348330509960907 | Subject: Software engineering
Abstract/Summary:
With the widespread use of new online media such as microblogs, many social events now spread through microblog platforms. On the one hand, microblog platforms broaden the channels through which people obtain information, which brings great convenience to daily life. On the other hand, some criminals use microblog platforms to hype sensitive events, which undermines social stability and solidarity. To counter the threat of malicious hyping of microblog events, this thesis studies the correlation between microblog events based on a bipartite graph, derives a quantitative measure of that correlation, and, on the basis of the correlation results, identifies the common promoters behind the events. The following work was done:

First, because new word detection is important for microblog text processing, and drawing on the structural characteristics of microblog text, we present an algorithm that extends an existing algorithm by using topics (the words between '#' and '#' in microblog text) as candidate new words. The correctness of the algorithm is verified experimentally.

Second, to detect events in a huge amount of microblog data, we present an event detection algorithm based on text-similarity clustering. Because microblog texts are short, we use the information entropy of words to improve the text vector space model, which makes the vectorization of microblog text more accurate. Because microblog data arrives continuously and in real time, we use an incremental clustering algorithm to improve clustering efficiency. By analyzing the features that describe a microblog event, we present an event description model based on a four-tuple of time, address, characters, and actions.
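The incremental clustering step can be illustrated with a minimal single-pass sketch. This is not the thesis's algorithm: the function names, the similarity threshold of 0.3, and the optional `term_weight` mapping (a stand-in for an entropy-based term weight, whose exact formula the abstract does not give) are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def vectorize(tokens, term_weight=None):
    """Term-frequency vector, optionally scaled by a per-term weight.

    term_weight is hypothetical: it is where an entropy-based weight
    could be plugged in. Terms without a weight default to 1.0.
    """
    term_weight = term_weight or {}
    return {t: c * term_weight.get(t, 1.0) for t, c in Counter(tokens).items()}

def single_pass_cluster(docs, threshold=0.3, term_weight=None):
    """Assign each document to its most similar cluster, or open a new one.

    Each document is seen once, so new posts can be clustered as they
    arrive without re-clustering the earlier ones.
    """
    centroids = []  # running sum of member vectors for each cluster
    labels = []
    for tokens in docs:
        vec = vectorize(tokens, term_weight)
        best, best_sim = None, threshold
        for i, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:
            centroids.append(dict(vec))
            labels.append(len(centroids) - 1)
        else:
            for t, w in vec.items():
                centroids[best][t] = centroids[best].get(t, 0.0) + w
            labels.append(best)
    return labels
```

Because cosine similarity ignores vector length, the running sum of member vectors behaves like a centroid here. The input is assumed to be already tokenized (word segmentation is a separate step).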
The elements of the four-tuple model are collected using named entity recognition and part-of-speech tagging.

Third, building on the preceding work, we extract the event and the users of each cluster to form an event set and a user set. Each user is given a general weight according to the user's role in the microblog event, and these weights are used to construct a weighted bipartite graph of the event set and the user set. After comparing a variety of bipartite graph projection algorithms, we propose a one-mode weighted projection algorithm that yields a quantitative representation of event correlation without losing the information in the bipartite graph. We then extract the common users of events with larger correlation values to find the common promoters behind those events.

Finally, we give a modular design and implementation of each algorithm proposed in this thesis, forming an event correlation analysis system for microblog public opinion analysis.
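The idea of projecting a weighted event-user bipartite graph onto the event side can be sketched as follows. The abstract does not specify the proposed projection, so this example substitutes a weighted Jaccard score (sum of per-user minimum weights divided by sum of per-user maximum weights) as a stand-in; the function name `project_events` and the `(event, user, weight)` edge format are likewise assumptions.

```python
from itertools import combinations

def project_events(edges):
    """Project a weighted event-user bipartite graph onto events.

    edges: iterable of (event, user, weight) triples.
    Returns {(e1, e2): (score, shared_users)} for every pair of events
    that shares at least one user. The score is a weighted Jaccard
    similarity, used here only as an illustrative stand-in.
    """
    by_event = {}
    for event, user, weight in edges:
        by_event.setdefault(event, {})[user] = weight

    scores = {}
    for e1, e2 in combinations(sorted(by_event), 2):
        u1, u2 = by_event[e1], by_event[e2]
        shared = sorted(set(u1) & set(u2))
        if not shared:
            continue  # no common users, so no edge in the projection
        users = set(u1) | set(u2)
        overlap = sum(min(u1.get(u, 0.0), u2.get(u, 0.0)) for u in users)
        total = sum(max(u1.get(u, 0.0), u2.get(u, 0.0)) for u in users)
        scores[(e1, e2)] = (overlap / total if total else 0.0, shared)
    return scores
```

The shared-user list returned with each score corresponds to the candidate common promoters: for event pairs with a high score, those are the users active in both events.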
Keywords/Search Tags: Microblog Event Correlation Analysis, New Word Detection, Short Text Clustering, Event Detection, Bipartite Graph Projection