
Research On Rough Set Theory In Knowledge Discovery

Posted on: 2004-01-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S H Liu
GTID: 1118360185496960
Subject: Computer software and theory
Abstract/Summary:
Rough set theory is a new mathematical tool for dealing with vagueness and uncertainty. It can analyze the facts hidden in data without any additional knowledge about the data. Owing to these advantages, rough set theory has received increasing attention from researchers and has been applied in a variety of areas in recent years. This thesis introduces the basic concepts and present state of rough set theory, proposes some efficient algorithms for rough set methods, and studies the application of rough set theory in knowledge discovery, especially in cluster analysis, text classification and case-based reasoning.

The contributions of this thesis are as follows:

Research on efficient algorithms for rough set methods: The inefficiency of existing algorithms partly limits the wider application of rough set theory, so it is important to find efficient algorithms for rough set methods. This thesis studies in depth the reasons for the algorithms' inefficiency, focusing on two important concepts, the indiscernibility relation and the positive region. It analyzes the properties of the indiscernibility relation, and proposes and proves an equivalent method for computing the positive region, from which some efficient basic algorithms are derived. The thesis then analyzes the incremental computation of the positive region and designs a complete algorithm for attribute reduction. The experimental results show that these algorithms are much more efficient than existing algorithms.

Research on a rough set-based clustering algorithm: Several definitions are given, including the local indiscernibility relation, the local and total indiscernibility degree between two objects, the indiscernibility degree between two clusters, and the integrated approximation rate of the clustering result. Based on these definitions, a rough set-based hierarchical clustering algorithm, RSHC, is proposed. It can automatically adjust its parameter to obtain a better result. The experimental results show that the algorithm is feasible and has good clustering performance, especially for symbolic attributes.

Research on rough set-based text classification: This thesis analyzes text classification from the viewpoint of information granularity and applies rough set theory to feature selection. The classical approach to calculating the term weight in the vector space model is then improved. Based on these results, an approach to multi-hierarchy text classification corresponding to multi-level granularity is proposed. In this approach, all classes are...
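
For readers unfamiliar with the two concepts the abstract singles out, the indiscernibility relation and the positive region, the following is a minimal sketch of their textbook definitions. It is not the equivalent method or the efficient algorithms proposed in the thesis, which this abstract does not describe. The function names, the toy decision table and its attribute names are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Equivalence classes of the indiscernibility relation induced by attrs.

    Two objects are indiscernible when they agree on every attribute in attrs.
    Returns a list of blocks, each a list of row indices.
    """
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, cond_attrs, dec_attr):
    """Textbook positive region: objects whose condition class is consistent.

    A block of the condition partition belongs to the positive region when
    all of its objects share the same decision value.
    """
    pos = []
    for block in partition(rows, cond_attrs):
        decisions = {rows[i][dec_attr] for i in block}
        if len(decisions) == 1:
            pos.extend(block)
    return sorted(pos)


# Hypothetical toy decision table (not data from the thesis).
table = [
    {"headache": "yes", "temp": "high", "flu": "yes"},
    {"headache": "yes", "temp": "high", "flu": "no"},
    {"headache": "no", "temp": "high", "flu": "yes"},
    {"headache": "no", "temp": "normal", "flu": "no"},
]

full = positive_region(table, ["headache", "temp"], "flu")
temp_only = positive_region(table, ["temp"], "flu")
```

In this toy table, objects 0 and 1 agree on both condition attributes but differ on the decision, so they fall outside the positive region. Dropping an attribute can only merge condition classes, so the positive region can shrink but never grow, which is the property attribute reduction relies on when checking whether an attribute is dispensable.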
Keywords/Search Tags: rough set, knowledge discovery, positive region, reduct, generalized approximation space, cluster analysis, information granularity, text classification, vector space model, feature selection, case-based reasoning, case retrieval