
Monitoring Based On The Micro-blog Burst Incidents Word And Sentiment Analysis

Posted on: 2016-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: G L Chen
Full Text: PDF
GTID: 2308330473965455
Subject: Software engineering
Abstract/Summary:
At present, micro-blogs have become the second largest source of public opinion and one of the main sources that traditional media use to track breaking news. Micro-blogs play an important role in the spread of emergencies: many public events, focal issues, and unexpected incidents spread rapidly on micro-blogs, become known to the public, and arouse widespread discussion. If micro-blog topics can be monitored in time to detect unexpected events, public opinion supervision departments and policy-makers can intervene at the right moment, address the crisis at its turning point, and as far as possible keep incidents from escalating into malignant group events.

As a new social medium, micro-blog content is short but time-sensitive, low in information content but frequently updated, and most micro-blog posts carry a clear emotional tendency. Short-text mining, topic detection, and sentiment analysis are therefore important research directions for micro-blogs. However, existing work is concentrated within each of these fields separately, with little intersection between them. Integrating them is not only innovative but also of great practical value. Building on previous studies, this paper uses Sina Micro-blog as the research platform and monitors micro-blog emergencies mainly from two aspects: the burstiness and the emotional color of public opinion events. Specifically, the paper covers the following:

1) Data collection from micro-blogs. This paper presents a multi-strategy data collection approach that combines the micro-blog API with simulated browser login for page crawling, achieving real-time, large-scale data acquisition from the micro-blog platform.

2) Micro-blog noise filtering. To improve monitoring efficiency, this paper proposes a noise filtering method based on constructing noise dictionaries and identifying noisy users. Experiments show that the method has a low recall rate for noisy micro-blogs but very high accuracy, so it can be used in the preprocessing stage without affecting extraction.

3) Burst topic discovery. This paper extracts burst words along three dimensions: relative word frequency, word frequency growth, and burst weight. Co-occurrence frequency is then used to calculate the similarity distance between words, and finally "absolute clustering" is used to cluster the burst words.

4) Topic sentiment analysis. This paper builds a micro-blog sentiment dictionary based on an open-source emotion dictionary, then proposes a method that combines the emotion dictionary with semantic rules to analyze the sentiment of micro-blog statements. For topic-level sentiment analysis, the spread influence of micro-blog users is incorporated to calculate the sentiment of the whole topic.
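As a rough illustration of item 4, the sketch below scores a post with a sentiment dictionary plus two simple semantic rules (negation and intensification), then aggregates posts into a topic score weighted by each author's influence. The abstract gives no formulas, so everything here is an assumption made for illustration: the toy English lexicon, the rule set, the `Post.influence` field, and the weighted-mean aggregation are hypothetical and are not the thesis's actual dictionary or method. Real Sina Micro-blog text would also need Chinese word segmentation rather than whitespace splitting.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon. The thesis derives its dictionary from an
# open-source emotion dictionary that the abstract does not name.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}


@dataclass
class Post:
    text: str
    influence: float  # assumed non-negative spread-influence weight of the author


def post_sentiment(text: str) -> float:
    """Score one post: lexicon lookup plus negation and intensifier rules."""
    score = 0.0
    negate = False
    scale = 1.0
    for token in text.lower().split():
        if token in NEGATORS:
            negate = not negate
        elif token in INTENSIFIERS:
            scale *= INTENSIFIERS[token]
        elif token in LEXICON:
            value = LEXICON[token] * scale
            score += -value if negate else value
            # Modifiers apply only to the next sentiment word, then reset.
            negate, scale = False, 1.0
    return score


def topic_sentiment(posts: list[Post]) -> float:
    """Influence-weighted mean of post scores; 0.0 if there is no weight."""
    total = sum(p.influence for p in posts)
    if total == 0:
        return 0.0
    return sum(p.influence * post_sentiment(p.text) for p in posts) / total
```

With this weighting, a strongly positive post from a high-influence user outweighs a mildly negative post from a low-influence user, which is one plausible reading of "combining the spread influence of the micro-blog user" in the abstract.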
Keywords/Search Tags:Micro-blog, Data Acquisition, Hot Topics Detection, Burst Public Opinion, Sentiment Analysis, Dictionary Structure