Detection And Tracking Of Accident Events Based On The BBS

Posted on: 2016-10-17  Degree: Master  Type: Thesis
Country: China  Candidate: K Y Wang  Full Text: PDF
GTID: 2308330461467269  Subject: Computer technology
Abstract/Summary:
With the development of Internet technology, society has entered the era of big data. Hundreds of millions of data items are generated and transmitted every day. Data at this scale brings enormous business opportunities, but also corresponding risks. Information is produced by events in real life and spreads across time and place. Events become shared topics of discussion because of the causal relationships between them. In recent years, topic detection and tracking (TDT) has been a research focus in the academic community. Taking emergency topics as its research object, this thesis uses news forum data as the data source for detecting and tracking emergency topics. A database table suited to the characteristics of emergency topics is designed to store the raw data. Based on these characteristics, content information and time information are extracted from the raw data. To obtain the content information, the thesis builds on an open-source word segmentation tool, adding a custom segmentation dictionary and a corresponding stop-word dictionary. A noise-filtering step then yields a clean data set, which is the basis for further feature extraction. Next, the thesis introduces the concepts of TFIW IDF and time windows to analyze the time information contained in the raw data: the noise-filtered data set is cut into a sequence of data segments, one per time window. Each segment is analyzed to extract the set of bursty words (words whose frequency rises suddenly) that characterize emergencies, and the time interval corresponding to each bursty word is calculated. The content co-occurrence and time co-occurrence of the bursty words are then computed to build a bursty-word similarity matrix.
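The time-window and co-occurrence steps described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the burst rule (a word's document frequency in one window exceeding a multiple of its mean over all windows), the Jaccard overlap used for both co-occurrence measures, the weight `alpha`, and all function names are assumptions made for illustration. The weighting scheme named in the abstract is not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import combinations


def split_into_windows(posts, window_size):
    """Group (timestamp, tokens) posts into fixed-width time windows."""
    windows = defaultdict(list)
    for timestamp, tokens in posts:
        windows[timestamp // window_size].append(tokens)
    return dict(windows)


def bursty_words(windows, ratio=2.0, min_count=2):
    """Return {word: sorted window ids in which the word bursts}.

    Assumed rule: a word bursts in a window when its document frequency
    there is at least `min_count` and at least `ratio` times its mean
    document frequency over all windows.
    """
    doc_freq = {
        w: Counter(t for doc in docs for t in set(doc))
        for w, docs in windows.items()
    }
    totals = Counter()
    for counts in doc_freq.values():
        totals.update(counts)
    n = len(windows)
    bursts = defaultdict(list)
    for w in sorted(doc_freq):
        for word, count in doc_freq[w].items():
            if count >= min_count and count >= ratio * totals[word] / n:
                bursts[word].append(w)
    return dict(bursts)


def jaccard(a, b):
    """Overlap of two sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(posts, bursts, alpha=0.5):
    """Combine content and time co-occurrence for each pair of bursty words."""
    words = sorted(bursts)
    docs_with = {
        w: {i for i, (_, tokens) in enumerate(posts) if w in tokens}
        for w in words
    }
    sim = {}
    for a, b in combinations(words, 2):
        content = jaccard(docs_with[a], docs_with[b])      # same posts
        time = jaccard(set(bursts[a]), set(bursts[b]))     # same windows
        sim[(a, b)] = alpha * content + (1 - alpha) * time
    return words, sim


# Toy forum posts: (timestamp, segmented and stop-word-filtered tokens).
posts = [
    (0, ["weather", "sunny"]),
    (1, ["weather", "rain"]),
    (10, ["fire", "explosion", "weather"]),
    (11, ["fire", "explosion"]),
    (12, ["fire", "rescue"]),
    (20, ["weather", "cloudy"]),
]
windows = split_into_windows(posts, window_size=10)
bursts = bursty_words(windows)
words, sim = similarity_matrix(posts, bursts)
```

In this toy data, "weather" appears in every window and so is not flagged, while "fire" and "explosion" are concentrated in one window and are.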
The similarity matrix then serves as the input to a hierarchical clustering algorithm. Bottom-up agglomerative hierarchical clustering is applied to the set of bursty words, producing a binary tree of bursty words (the topic tree). Several tree segmentation mechanisms are used to split the topic tree effectively and obtain the emergency topics. To satisfy the TDT definition of a topic, the thesis applies corresponding constraints to the emergency topics and maps them back to the original document stream. On the basis of this work, an emergency topic detection system is built, and tests of the theory and the system on BBS data achieved good results.
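The clustering and tree-splitting steps can be sketched as follows. This is a hedged illustration, not the thesis's algorithm: the single-linkage merge rule, the fixed similarity threshold used to cut the tree, the toy similarity values, and the names `agglomerative_tree` and `cut_tree` are all assumptions. The abstract mentions several segmentation mechanisms without specifying them. The sketch assumes a non-empty word list.

```python
def leaves(node):
    """Return the words under a tree node (a word or a (left, right, sim) tuple)."""
    if isinstance(node, str):
        return [node]
    return leaves(node[0]) + leaves(node[1])


def agglomerative_tree(words, sim, linkage=max):
    """Bottom-up clustering: repeatedly merge the two most similar clusters.

    `sim` maps unordered word pairs to similarities; missing pairs count as 0.
    The default `linkage=max` is single linkage (an assumed choice).
    Returns the root of a binary tree whose internal nodes are
    (left, right, merge_similarity) tuples.
    """
    def pair_sim(a, b):
        return sim.get((a, b), sim.get((b, a), 0.0))

    def cluster_sim(x, y):
        return linkage(pair_sim(a, b) for a in leaves(x) for b in leaves(y))

    clusters = list(words)
    while len(clusters) > 1:
        i, j = max(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda p: cluster_sim(clusters[p[0]], clusters[p[1]]),
        )
        merged = (clusters[i], clusters[j], cluster_sim(clusters[i], clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]


def cut_tree(node, threshold):
    """Split the tree at merges weaker than `threshold`; return word sets."""
    if isinstance(node, str):
        return [{node}]
    left, right, merge_similarity = node
    if merge_similarity >= threshold:
        return [set(leaves(node))]
    return cut_tree(left, threshold) + cut_tree(right, threshold)


# Toy similarity values between bursty words (illustrative only).
words = ["explosion", "fire", "rescue", "rain", "flood"]
sim = {
    ("explosion", "fire"): 0.9,
    ("fire", "rescue"): 0.7,
    ("explosion", "rescue"): 0.6,
    ("flood", "rain"): 0.8,
    ("fire", "rain"): 0.1,
}
root = agglomerative_tree(words, sim)
topics = cut_tree(root, threshold=0.5)
```

Each resulting word set stands for one candidate topic. A threshold cut is only valid here because single linkage merges at non-increasing similarity on the way up the tree, so any node that passes the threshold has a subtree that passes it as well.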
Keywords/Search Tags: Topic detection, time window, emergency, hierarchical clustering