Tolerance Granular Computing Model And Its Research On Data Mining

Posted on:2013-03-31

Degree:Doctor

Type:Dissertation

Country:China

Candidate:J Meng

Full Text:PDF

GTID:1228330395499289

Subject:Computer application technology

Abstract/Summary:


Granular computing is a relatively new idea and method with a notable advantage in data mining: it provides a powerful tool for mining massive data and solving complex problems. The classical rough set is restricted to equivalence relations on the domain. In practical applications, an equivalence relation is often too strict to be satisfied, which limits how widely the theory can be applied. If transitivity is dropped, the equivalence relation degenerates to a tolerance relation. A tolerance relation does not yield a partition of the domain but rather a covering. Tolerance rough set theory is usually based on incomplete information systems. Here, the neighborhood is taken as the granule, and knowledge is abstracted as a derived partition of the universe. The derived partition satisfies an equivalence relation. They do not rely on a specific description of the problem. By constructing a tolerance granular computing model, several problems of granular computing are studied, including granulation, granular computation, classification and clustering. The research work is summarized as follows:

(1) From the point of view of set theory, rough approximations based on tolerance relations and neighborhood systems are proposed. Their properties are proved and the accuracy measure is discussed. We find that the tolerance knowledge base and the tolerance information table are in exact one-to-one correspondence. Pawlak’s complete theorem is extended to a general complete theorem. Attribute reduction and rule extraction methods based on neighborhood dependency and center dependency are proposed, and their generality is analyzed through examples.

(2) A new symbolic representation method based on granulation is proposed, and information granulation is applied to time series classification. Each time series is segmented, information granules are constructed for each segment, and the similarity between the granules of the segments is computed. Spectral clustering is then applied to the resulting similarity matrix. On four time series datasets from the UCR Time Series Data Mining Archive, the experimental results show that the proposed granulation works successfully with a Hidden Markov Model. Compared with the supervised method and the self-training method, the semi-supervised method can construct accurate classifiers with very little labeled data.

(3) A linear neighborhood propagation method based on rough K-means clustering is investigated. By analyzing the approximate distribution of the data, the method tests whether two data points lie in the upper approximation or the lower approximation, which provides information beyond the distance between data points. This information is used to adjust the choice of neighborhoods when the graph is constructed. Experiments on UCI datasets show that the method is more effective than standard linear neighborhood propagation (LNP).
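The remark that a tolerance relation gives a covering rather than a partition can be illustrated with a small sketch. It assumes the common convention for incomplete information systems in which two objects are tolerant when every attribute either agrees or is missing in at least one of them; the dissertation's own definitions may differ. The table, the `*` missing-value marker, and all function names are hypothetical and chosen only for illustration.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def tolerant(x, y):
    """True if every attribute pair agrees or has a missing value on either side."""
    return all(a == b or MISSING in (a, b) for a, b in zip(x, y))


def tolerance_classes(table):
    """Map each object to the set of objects that are tolerant with it."""
    return {u: {v for v in table if tolerant(table[u], table[v])} for u in table}


def approximations(table, target):
    """Lower and upper approximations of `target` under the tolerance relation."""
    classes = tolerance_classes(table)
    lower = {u for u, cls in classes.items() if cls <= target}  # class inside target
    upper = {u for u, cls in classes.items() if cls & target}   # class meets target
    return lower, upper


# Hypothetical incomplete information table: object -> attribute values.
table = {
    "x1": ("high", "yes"),
    "x2": ("high", "*"),
    "x3": ("low", "no"),
    "x4": ("*", "no"),
}

target = {"x1", "x2"}
lower, upper = approximations(table, target)
accuracy = len(lower) / len(upper)  # accuracy measure |lower| / |upper|

print(tolerance_classes(table))
print(lower, upper, accuracy)
```

In this toy table x1 is tolerant with x2 and x2 with x4, but x1 is not tolerant with x4, so the relation is not transitive and the tolerance classes overlap instead of partitioning the objects.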

Keywords/Search Tags:

Rough Sets, Data Mining, Incomplete Information System, Granular Computing


Related items

1	Researched For Processing Approach Of Incomplete Information System Based On Rough Set Theory
2	Research On Incomplete Information System Data Mining
3	The Extended Research And Application Of Rough Sets In Incomplete Information System
4	Study On Incomplete Data Mining Based On Rough Sets And Granular Computing
5	Rough Set Approach To Data Mining In Incomplete Information Systems
6	The Application Research Of Rough Set In Data Mining Of Incomplete Information System
7	Research On Models And Algorithms For Feature Selection On Dynamic Incomplete Data
8	Study Of Data Mining In Incomplete Information Systems Based On Rough Set Theory
9	The Research Of Multigranulation Rough Sets And Knowledge Reduction In Incomplete Information System
10	Processing Methods For Incomplete Information Systems Based On Rough Sets