
Research On Microblog Emergency Detection Method Based On Multi-feature Fusion

Posted on: 2019-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2438330569996480
Subject: Computer application technology
Abstract/Summary:
In recent years, emergencies have occurred frequently in many fields. Whether natural disasters or man-made social events, they have greatly affected people's daily lives and psychological well-being. With the advent and rapid development of new social media, microblogs, a distributed medium centered on interaction between users, have made the dissemination of information faster and more widespread, and they offer a richer body of information and resources to mine and explore. Detecting bursty events in a timely and accurate manner has gradually become a core research topic for microblog data. Such detection can reveal sudden social dynamics in time, which is of practical significance for social stability and the public interest.

This thesis divides the microblog bursty event detection problem into event detection based on bursty words and event detection based on entity features. The characteristics of microblog data are analyzed and extracted from different perspectives, and the two sets of results are finally merged by similarity at both the time granularity and the content granularity. The approach improves the accuracy of the algorithm and the efficiency of its implementation, so that relevant decision-makers can promptly take countermeasures and carry out timely network monitoring and public opinion guidance to minimize the harm and impact of emergencies on society. The main research work of this thesis is as follows:

(1) Research on Bursty Word Extraction Based on Multi-bursty Features
A bursty word extraction method based on multiple burst features is proposed. The method slices the microblog data into time windows according to the time information and calculates the word frequency feature, topic tag feature, and word frequency growth feature of each word in each time window. A fusion method based on DS evidence theory and the analytic hierarchy process then determines the weight of each feature, and a set of words with burst features is finally selected according to the weighted feature values.

(2) Event Detection Based on Clustering of Words
An event detection method based on burst feature words is proposed. The method uses the co-occurrence degree and mutual information of the words in the bursty word set to calculate the degree of coupling between burst words and constructs the corresponding coupling degree matrix. This matrix is used as the input of an agglomerative hierarchical clustering algorithm, which generates a binary tree with the burst words as leaf nodes. Finally, a binary tree pruning algorithm based on internal similarity partitions the clustering result, yielding the emergency events and the burst words that occur in the corresponding time window.

(3) A burst entity detection method based on dynamically divided time windows is proposed. The method does not require word segmentation; instead, it treats each Chinese character and each English word as a single entity. Entities are assigned to time slices by a message window and a dynamic time window determination algorithm. The burst feature weight of each entity is then calculated from the offset feature and the influence feature of the current time slice, fused with the entity's history under a time-based decay, and a set of entities with burst characteristics is extracted.

(4) Event Detection Method Based on Burst Entity Fusion
A clustering event detection method based on burst entity expansion is proposed. The method constructs an entity-message matrix and a message-user matrix for the burst entities and then, based on the relationships among entities, microblog users, and microblog texts, applies combined clustering to obtain clusters of burst entities. Finally, within each cluster, the entities are merged into words or phrases based on mutual information and left and right entropy, which realizes event detection based on entity feature fusion.

(5) Fusion Method Based on Event Similarity Calculation and Weibo Mapping
A fusion method for similar events is proposed. Its main function is to fuse the event detection results based on burst words with the event detection results based on entity features. The fusion has two parts: fusion at the time granularity and fusion at the content granularity. Based on the merged events, a similarity-based mapping method then extracts the microblog messages most relevant to each event as a detailed supplementary description of that event.

Finally, a microblog emergency detection system based on multi-feature fusion is implemented. The system consists of four modules: microblog data acquisition and preprocessing, event detection based on burst words, event detection based on entity features, and fusion of similar events. It can automatically run and visualize the microblog data preprocessing, feature extraction, and feature fusion algorithms.
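
As an illustration of the bursty word extraction step in item (1), the following is a minimal sketch, not the thesis implementation. It slices posts into fixed-length time windows and scores each word by a weighted sum of a word frequency feature, a topic tag feature, and a word frequency growth feature. The feature formulas, the function names, and the fixed values in `WEIGHTS` are assumptions made for this example; the thesis derives its weights with DS evidence theory and the analytic hierarchy process, which the sketch does not reproduce.

```python
from collections import Counter

# Hypothetical fixed weights (an assumption for this sketch). The thesis
# derives its weights with DS evidence theory and the analytic hierarchy
# process instead of hard-coding them.
WEIGHTS = {"freq": 0.4, "tag": 0.2, "growth": 0.4}


def window_counts(posts, window_seconds):
    """Group posts into fixed-length time windows and count words per window.

    Each post is (timestamp_seconds, list_of_words, set_of_topic_tag_words).
    Returns {window_index: (word_counter, tag_counter)}.
    """
    windows = {}
    for ts, words, tags in posts:
        idx = int(ts // window_seconds)
        word_counter, tag_counter = windows.setdefault(idx, (Counter(), Counter()))
        word_counter.update(words)
        # Count only the occurrences of a word that fall in a tagged post.
        tag_counter.update(w for w in words if w in tags)
    return windows


def burst_scores(windows, idx):
    """Score each word in window idx by a weighted sum of three features."""
    words_now, tags_now = windows[idx]
    words_prev = windows.get(idx - 1, (Counter(), Counter()))[0]
    total = sum(words_now.values()) or 1
    scores = {}
    for word, count in words_now.items():
        freq = count / total                      # relative word frequency
        tag = tags_now[word] / count              # share of tagged occurrences
        prev = words_prev[word]
        growth = (count - prev) / (count + prev)  # in [-1, 1]; 1 for a new word
        scores[word] = (WEIGHTS["freq"] * freq
                        + WEIGHTS["tag"] * tag
                        + WEIGHTS["growth"] * max(growth, 0.0))
    return scores


def top_bursty_words(windows, idx, k=3):
    """Return the k highest-scoring words of window idx (ties broken by name)."""
    scores = burst_scores(windows, idx)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    posts = [
        (0, ["weather", "nice"], set()),
        (10, ["weather", "rain"], set()),
        (60, ["earthquake", "weather"], {"earthquake"}),
        (70, ["earthquake", "help"], {"earthquake"}),
        (80, ["earthquake", "rain"], set()),
    ]
    windows = window_counts(posts, window_seconds=60)
    print(top_bursty_words(windows, 1, k=2))  # ['earthquake', 'help']
```

In this toy data, "earthquake" is absent from the first window, frequent in the second, and partly carried by a topic tag, so it ranks first. "weather" is frequent overall but declining, so its growth term is clipped to zero.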
Keywords/Search Tags: bursty word, bursty entity, dynamic time window, event similarity, incident detection