The Application Of Emergency News Text Clustering Based On Formal Concept Analysis

Posted on: 2011-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Fan
GTID: 2178360305495328
Subject: Computer application technology
Abstract/Summary:
Text clustering is one of the most important research branches of cluster analysis. It combines clustering methods with natural language processing for text processing. With the rapid growth of text in Internet news, text clustering has been widely studied and applied. However, because texts vary in structure, content, and type, traditional text clustering methods show weaknesses in areas such as text model representation and text feature selection.

This thesis studies the field of emergency news in depth. Using a corpus of emergency news texts collected from 2000 to 2009, we propose an approach that uses Formal Concept Analysis to represent text content for text clustering. The main work is as follows:

1. We analyze the characteristics of emergency news text and use the concept lattice as its feature set. On this basis, we improve the traditional methods of text clustering.
2. We modify the traditional term frequency-inverse document frequency (TF-IDF) model. The improved algorithm represents text content more faithfully and improves the representation results.
3. We improve the similarity computation to suit the characteristics of emergency news text. Time similarity, place similarity, and content similarity are calculated separately and then combined to represent text similarity.
4. We design and implement an experimental system for emergency news text clustering using Formal Concept Analysis, and verify the proposed method on a real corpus of emergency news.

We compare our algorithm with several traditional methods on three evaluation measures: precision, recall, and F value. The experimental results show that our method obtains better clustering results.
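The abstract does not give the formulas behind the combined similarity, so the following is only a minimal sketch of the general idea of scoring time, place, and content separately and then merging the scores. Everything specific in it is an assumption made for illustration: plain TF-IDF with cosine similarity stands in for the thesis's modified TF-IDF and concept-lattice features, the exponential time decay, the exact-match place score, the weights (0.2, 0.2, 0.6), and the toy reports are all hypothetical.

```python
import math
from collections import Counter
from datetime import date


def tfidf_vectors(docs):
    """Plain TF-IDF over tokenized documents (NOT the thesis's modified variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def time_similarity(d1, d2, scale_days=30.0):
    """Assumed form: decays exponentially with the gap in days."""
    return math.exp(-abs((d1 - d2).days) / scale_days)


def place_similarity(p1, p2):
    """Assumed form: 1.0 for an exact place-name match, else 0.0."""
    return 1.0 if p1 == p2 else 0.0


def combined_similarity(a, b, weights=(0.2, 0.2, 0.6)):
    """Weighted sum of the three scores; the weights are hypothetical."""
    w_time, w_place, w_content = weights
    return (w_time * time_similarity(a["date"], b["date"])
            + w_place * place_similarity(a["place"], b["place"])
            + w_content * cosine(a["vector"], b["vector"]))


# Made-up toy reports, not drawn from the thesis corpus.
reports = [
    {"date": date(2008, 5, 12), "place": "CityA", "tokens": ["earthquake", "rescue", "casualties"]},
    {"date": date(2008, 5, 14), "place": "CityA", "tokens": ["earthquake", "rescue", "aftershock"]},
    {"date": date(2005, 8, 3), "place": "CityB", "tokens": ["flood", "evacuation", "rescue"]},
]
for report, vector in zip(reports, tfidf_vectors([r["tokens"] for r in reports])):
    report["vector"] = vector

same_event = combined_similarity(reports[0], reports[1])
different_event = combined_similarity(reports[0], reports[2])
```

With these toy inputs, the two reports that share a date range, a place, and vocabulary score higher than the unrelated pair. A pairwise score of this kind could then feed any similarity-based clustering algorithm.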
Keywords/Search Tags:Formal Concept Analysis, Concept Lattice, Emergency News Text, Text Feature Set