Theoretical Research on Clustering/Classification in Text Mining Applications

Posted on: 2001-11-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D B Bo
Full Text: PDF
GTID: 1118360185495636
Subject: Computer Organization and Architecture
Abstract/Summary:
Making the Internet easier to use is a real challenge. Information on the Internet is poorly organized and spread across a huge number of pages, while users want to find what they need quickly and accurately. AI-based clustering, classification, and abstracting techniques, together with the so-called "Knowledge Indexing" technique, appear to be good approaches to this problem. This thesis discusses clustering/classification techniques against the background of information retrieval.

First, we summarize the key techniques used for clustering/classification in different fields such as statistics, machine learning, and pattern recognition.

We propose a new classification algorithm based on the theory of "information granularity". We find that a clustering corresponds to a particular equivalence relation on the sample set, and that a series of equivalence relations of different information granularity corresponds to a clustering diagram. From the viewpoint of granularity, it becomes clearer that clustering is a procedure carried out at a uniform granularity, while classification works at different granularities.

After selecting terms to represent the samples, we can treat the samples as points in the term space that have the same weight but different coordinates. Considering the energy field generated by universal gravitation, we can obtain a topological structure from the relations among equilibrium curves of different energy. This topological structure corresponds to a particular clustering diagram. We...
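The correspondence between equivalence relations and a clustering diagram can be illustrated with a small sketch. It is not the algorithm proposed in the thesis; it only assumes that two samples are equivalent at granularity `eps` when a chain of samples connects them with every step no longer than `eps` (the transitive closure of a distance threshold). The names `partition_at`, `eps`, and `points`, and the sample coordinates, are hypothetical and chosen for illustration.

```python
from itertools import combinations
from math import dist


def partition_at(points, eps):
    """Return the equivalence classes (as lists of indices) of the
    transitive closure of the relation 'distance <= eps'.

    Assumption for illustration only: this threshold relation stands in
    for an equivalence relation at one level of information granularity.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the class representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            ri, rj = find(i), find(j)
            if ri != rj:
                # Merge the two classes; the smaller index becomes the representative.
                parent[max(ri, rj)] = min(ri, rj)

    classes = {}
    for i in range(len(points)):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())


# Hypothetical samples already mapped to points in a two-term space.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5), (20.0, 0.0)]

# One partition per granularity, from finest to coarsest.
hierarchy = {eps: partition_at(points, eps) for eps in (0.5, 1.0, 2.0, 6.0, 25.0)}

for eps, classes in hierarchy.items():
    print(eps, classes)
```

Each coarser partition in `hierarchy` only merges classes of the finer one, so the sequence of partitions can be read as the levels of a clustering diagram. Under this assumption the sketch behaves like single-linkage clustering cut at several thresholds.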
Keywords/Search Tags: Clustering/Classification, Information Granularity, Minimum Description Length, Rules + Exception, Energy field, Topology structure, Latent Concept