
Study On Topic Detection And Tracking On Food Security

Posted on: 2013-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Song
GTID: 2231330371466567
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, a large number of food security incidents have been reported, and these topics have attracted the attention of the public and the media. This thesis focuses on topic detection and tracking algorithms and technology in the field of food security. Its main contributions are as follows:
(1) A new clustering algorithm for topic detection, C-KMC, is proposed, consisting of a C process and a K process. In the C process, the large sample set is divided into many smaller, overlapping Canopies using a computationally inexpensive distance measure. Provided the Canopies do not destroy the structure of the text clusters, they facilitate the subsequent clustering, because they effectively reduce the size of the data set to be clustered. In the K process, the X-means algorithm is used to cluster the texts and detect topics based on the Canopies generated in the first step. In addition, a complementary algorithm is designed that assigns any omitted samples to the appropriate clusters.
(2) A feature clustering algorithm based on chi-square relevance, C-SRFC, is proposed; it effectively reduces the size of the feature space and the time cost of text classification. Three feature cluster selection approaches are designed, which select the stronger feature clusters and remove the weaker ones to reduce the feature space further. A CF-IDF weighting approach is designed and its performance is compared with that of other weighting approaches. Experimental results show that CF-IDF is better suited to the feature space built by the feature clustering algorithm proposed in this thesis.
(3) A food security topic detection and tracking system is designed and implemented, integrating the text clustering and text classification algorithms proposed in this thesis. The system supports text crawling, topic detection, and topic tracking, as well as the generation of line graphs, histograms, and topic displays on a map. It acquires online reports, analyzes them to detect hot topics in real time, tracks those topics, and delivers reports with text and images to users. The system can provide information on food security to the public and to the agencies responsible for food security affairs.
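The Canopy stage described in (1) can be illustrated with a minimal sketch of generic Canopy pre-clustering. It is not the thesis's C-KMC implementation, whose details are not given in this abstract. The function names, the Jaccard-based cheap distance, the token-set document representation, and the thresholds t1 (loose) and t2 (tight) are all assumptions made for illustration. The second stage (X-means within the Canopies) is not shown.

```python
def cheap_distance(a, b):
    """Inexpensive distance between two token sets: 1 - Jaccard similarity.

    Assumption: the abstract does not name the cheap measure it uses.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def build_canopies(docs, t1, t2):
    """Group documents into overlapping canopies.

    docs: list of token sets, one per document.
    t1:   loose threshold; documents within t1 of a center join its canopy.
    t2:   tight threshold (t2 <= t1); documents within t2 of a center
          are no longer eligible to become centers themselves.
    Returns a list of (center_index, member_indices) pairs.
    """
    if not 0 <= t2 <= t1:
        raise ValueError("expected 0 <= t2 <= t1")
    remaining = list(range(len(docs)))
    canopies = []
    while remaining:
        center = remaining[0]
        # Every document loosely close to the center joins this canopy,
        # so a document may belong to several canopies.
        members = [i for i in range(len(docs))
                   if cheap_distance(docs[center], docs[i]) <= t1]
        canopies.append((center, members))
        # Documents tightly close to the center (including the center,
        # at distance 0) are dropped from the candidate-center list.
        remaining = [i for i in remaining
                     if cheap_distance(docs[center], docs[i]) > t2]
    return canopies


# Hypothetical toy corpus: each document is a set of keywords.
docs = [
    {"milk", "melamine", "recall"},
    {"milk", "melamine", "infant"},
    {"pork", "clenbuterol", "farm"},
    {"pork", "clenbuterol", "market"},
    {"milk", "pork", "recall", "farm"},
]
canopies = build_canopies(docs, t1=0.7, t2=0.5)
for center, members in canopies:
    print(center, members)
```

In this toy run, the last document shares keywords with both the milk and the pork reports, so it lands in more than one canopy. A more expensive clustering step would then only need to compare documents that share a canopy.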
Keywords/Search Tags: topic detection, topic tracking, text clustering, text classification, Canopy clustering, feature extraction