
Design And Implementation Of Disaster Information Extracting System Based On Microblog Stream

Posted on: 2020-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: R Zheng
Full Text: PDF
GTID: 2428330590476763
Subject: Computer application technology
Abstract/Summary:
Social media satisfies people's needs for information and emotional expression, needs that become more urgent when a disaster occurs. Social media also provides dynamic, real-time data produced spontaneously by users. As a representative form of social media, microblogs can serve as an important supplement to traditional disaster information extraction methods, contributing both real-time information and emotional signals. Processing disaster-related microblog data is a time-sensitive task, since its users expect to see the processed data and the processing results as soon as possible. Considering the streaming nature of microblog data, this paper therefore designs and implements a disaster information extraction system based on microblog streams. The system targets Chinese microblogs and implements the complete process of microblog data acquisition, preprocessing, information extraction, statistics on the extraction results, and visualization of the statistical results. It provides information extraction on microblog stream data, supported by a microblog data acquisition engine and a microblog analysis engine.

For the acquisition and preprocessing of disaster-related microblog data, this paper designs and implements a microblog crawling strategy and a preprocessing method based on the characteristics of microblog data. First, disaster-related Sina Weibo data is crawled from three sources: user home pages, historical search results, and real-time search results. The data is then preprocessed, including data cleaning and Chinese word segmentation, to prepare the input for information extraction.

For disaster-related microblog information extraction, this paper proposes a text classification method and a sentiment analysis method for disaster-related microblog data, which provide model support for the extraction process. The text classification method is based on the FastText model, and the sentiment analysis method is dictionary-based. These methods can be used to perform microblog text classification and emotion classification tasks. Two sets of experimental data were obtained using the microblog data acquisition engine, and the methods were evaluated on them.

Building on the implementation of these methods, this paper designs a framework for a disaster information extraction system based on microblog stream data and implements a prototype on the Spark distributed computing framework. The prototype contains the microblog data acquisition engine and the analysis engine. It can perform text classification and sentiment classification on microblog data, compute time series statistics on the classification results, and visualize the statistical results. The visualization features of the system are demonstrated with the experimental data.
Keywords/Search Tags:Microblog, stream processing, information extracting, disaster-related information
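
The abstract mentions dictionary-based sentiment analysis followed by time series statistics on the classification results. A minimal sketch of that idea is given here for illustration only. It is not the thesis's implementation: the thesis works on segmented Chinese text and runs on Spark, while this sketch uses plain Python, English placeholder tokens, and a made-up lexicon. The names `POSITIVE`, `NEGATIVE`, `NEGATIONS`, `sentiment`, and `hourly_counts` are all hypothetical, and the one-token negation rule is an assumption.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy lexicon; the thesis does not list its dictionary entries.
POSITIVE = {"safe", "rescued", "thanks"}
NEGATIVE = {"trapped", "afraid", "collapsed"}
NEGATIONS = {"not", "no"}


def sentiment(tokens):
    """Label a list of pre-segmented tokens as positive, negative, or neutral."""
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATIONS:
            # Assumed rule: a negation word flips only the next token.
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)  # +1, -1, or 0
        if polarity:
            score += -polarity if negate else polarity
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def hourly_counts(posts):
    """Count sentiment labels per hour.

    posts is an iterable of (iso_timestamp, tokens) pairs.
    Returns a dict mapping each hour (a datetime) to a Counter of labels.
    """
    buckets = defaultdict(Counter)
    for ts, tokens in posts:
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        buckets[hour][sentiment(tokens)] += 1
    return dict(buckets)
```

Calling `hourly_counts` on a batch of `(timestamp, tokens)` pairs yields per-hour label counts that a charting layer could plot as a time series. In a streaming setting the same bucketing would typically be expressed as a windowed aggregation rather than an in-memory dictionary.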