Clustering / Classification Of Theoretical Research In Text Mining Applications

Posted on:2001-11-04

Degree:Doctor

Type:Dissertation

Country:China

Candidate:D B Bo

Full Text:PDF

GTID:1118360185495636

Subject:Computer Organization and Architecture

Abstract/Summary:

Making the Internet easier to use is a real challenge. Information on the Internet is poorly organized and spread across a huge number of pages, while people want to obtain information quickly and accurately. Clustering, classification, and abstracting techniques based on AI, together with the so-called "Knowledge Indexing" technique, appear to be good approaches to such problems. This thesis discusses clustering/classification techniques against the background of information retrieval.

First, we summarize the key techniques used for clustering/classification in different fields such as statistics, machine learning, and pattern recognition.

We propose a new classification algorithm based on the theory of "information granularity". We find that a clustering corresponds to a particular equivalence relation on the sample set, and that a series of equivalence relations with different information granularities corresponds to a clustering diagram. From the viewpoint of granularity, it becomes clearer that clustering is a procedure carried out at a uniform granularity, while classification is carried out at different granularities.

After selecting terms to represent the samples, we can treat the samples as points in the term space that have the same weight and different coordinates. Considering the energy field constructed by universal gravitation, we can obtain a topological structure from the relations among equilibrium curves of different energy. This topological structure corresponds to a particular clustering diagram. We...
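The correspondence between clusterings and equivalence relations mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm. It assumes one simple relation, "two samples are linked by a chain of neighbours no farther apart than `eps`", which is reflexive, symmetric, and transitive. The names `partition_at`, `refines`, and `samples`, the Euclidean distance, and the sample coordinates are all hypothetical choices made for illustration. Raising `eps` coarsens the granularity, and the resulting nested partitions form a clustering diagram.

```python
from itertools import combinations
from math import dist


def partition_at(points, eps):
    """Equivalence classes of 'linked by a chain of neighbours within eps'."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the class representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the classes of every pair of points closer than the threshold.
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    classes = {}
    for i in range(len(points)):
        classes.setdefault(find(i), set()).add(i)
    # Order classes by their smallest member so the output is deterministic.
    return sorted((frozenset(c) for c in classes.values()), key=min)


def refines(fine, coarse):
    """True if every class of `fine` lies inside some class of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)


# Hypothetical samples in a two-term space (assumed data, not from the thesis).
samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5), (20.0, 0.0)]

# One partition per granularity level, from finest to coarsest.
levels = [partition_at(samples, eps) for eps in (0.5, 1.0, 2.0, 6.0, 30.0)]
```

Each entry of `levels` is one clustering at a uniform granularity. Because every finer partition refines the next coarser one, the sequence can be drawn as a tree, which is one way to read the "clustering diagram" the abstract refers to.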

Keywords/Search Tags:

Clustering/Classification, Information Granularity, Minimum Description Length, Rules + Exception, Energy field, Topology structure, Latent Concept

Related items

1	Structure Analysis Of Large Graph Data Based On Minimum Description Length
2	Research And Application Analysis Of Optimal Reasoning Models Based On Minimum Description Length
3	Statistical Shape Modeling Based On Minimum Description Length Optimization And Segmenting In Medical Images
4	Research On The Key Technologies Of Social Network Structure Partition
5	Research And Application On The Exception Rule Mining Algorithm
6	Research On Extraction Of Linear Structures And Segmentation Of Regions In Images
7	Study On Energy-Saving And Fault-Tolerant Algorithms For Wireless Sensor Networks Based On Topology Control
8	Study On The Unlabeled Text Mining Methods Based On The Concept Lattice Extension Models
9	Community Structure Detecting Of Multiple Granularity And Visualization Based On Internet Network Topology
10	Research On Methods Of Updating Formal Concept And Selecting Granularity Under Covering Multi-granularity