Integrated Methods of Rough Set and Fuzzy Set for Classification Knowledge Discovery

Posted on: 2014-12-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Wang
GTID: 1269330401479789
Subject: Management Science and Engineering
Abstract/Summary:
Building classification models from data is a significant branch of knowledge discovery. Uncertainty, inconsistency, and randomness in real data are the key problems in classification knowledge discovery. Rough set theory and fuzzy set theory are both generalizations of classical set theory that deal with uncertainty and vagueness in data. Rough set theory is mainly concerned with the discernibility between objects of different classes, whereas fuzzy set theory is mainly concerned with the overlap between different classes. A fuzzy set determines degrees of membership from expert experience, while a rough set can find hidden and potential rules in data without any prior information. The two theories are distinct and complementary, so combining them is an effective way to handle real problems. This dissertation studies integrated methods of rough set and fuzzy set for classification knowledge discovery from continuous data. The main contributions of the work are as follows.

1. A fuzzy classification model has the advantage of clear physical meaning, and the decision-theoretic rough set is a powerful tool for processing uncertain and random information. A new fuzzy classification model that combines the advantages of these two theories is proposed. The properties of the decision-theoretic rough set are studied, and a definition of attribute reduction and a corresponding reduction algorithm are then given. Fuzzy c-means clustering is used to discretize the continuous attributes and to partition the input space. A heuristic attribute reduction algorithm based on a two-step search strategy is applied to the discretized decision table to remove redundant condition attributes, and concise decision rules are then extracted. The rules of the fuzzy classification model are derived from the extracted decision rules, so the proposed model is based on data analysis.
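The fuzzy c-means discretization step described above can be sketched for a single continuous attribute as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names, the evenly spaced initial centers, the fuzzifier `m = 2.0`, and the choice to label each value by its highest-membership cluster are all hypothetical.

```python
def fcm_memberships(x, centers, m=2.0):
    """Standard FCM membership of scalar x in each cluster."""
    dists = [abs(x - c) for c in centers]
    if any(d == 0.0 for d in dists):
        # x coincides with a center: full membership there (shared on ties)
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


def fuzzy_c_means_1d(values, c, m=2.0, max_iter=100, tol=1e-6):
    """Return c cluster centers for a 1-D attribute (requires c >= 2)."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic, evenly spaced initial centers
    centers = [lo + (hi - lo) * k / (c - 1) for k in range(c)]
    for _ in range(max_iter):
        u = [fcm_memberships(x, centers, m) for x in values]
        new_centers = []
        for k in range(c):
            weights = [row[k] ** m for row in u]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights)
            )
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


def discretize(values, centers, m=2.0):
    """Map each value to the index of its highest-membership cluster."""
    labels = []
    for x in values:
        u = fcm_memberships(x, centers, m)
        labels.append(max(range(len(centers)), key=lambda k: u[k]))
    return labels


if __name__ == "__main__":
    attribute = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]
    centers = fuzzy_c_means_1d(attribute, c=2)
    print(centers, discretize(attribute, centers))
```

The resulting cluster indices would play the role of discretized attribute values in the decision table, while the memberships themselves give the fuzzy partition of the input space.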
The fuzzy classification rules of the proposed model have the advantages of clear physical meaning, simple structure, and good generalization ability. Moreover, a learning algorithm is no longer needed to optimize the parameters of the fuzzy model.

2. Lingras' rough k-means clustering algorithm is analyzed, and several shortcomings are pointed out: it is sensitive to the initial cluster centers and to outliers, and it may produce identical clusters or fail to converge. Based on this analysis, an improved rough k-means clustering algorithm is presented. The k objects with the maximum potentials are chosen as initial centers. The absolute distance between an object and a cluster center is used to decide whether the object belongs to the lower or the upper approximation of a cluster, so the boundary region is divided appropriately. Because the improved algorithm starts from reasonable initial centers, it obtains accurate cluster centers. A new fuzzy classification model is developed on top of the improved rough k-means clustering algorithm, which is used to group the data set and partition the fuzzy input space. The initial fuzzy classification model is identified from the clustering results, and the premise parameters of the fuzzy model are optimized by a genetic algorithm. The resulting fuzzy model, with accurate parameters, offers high precision and good generalization.

3. The decision-theoretic rough set model can only handle discretized data, so a fuzzy decision-theoretic rough set model that can deal with continuous data directly is studied. The indiscernibility relation is generalized to a fuzzy T-equivalence relation based on the Gaussian kernel, and conditional probability is redefined in terms of degrees of fuzzy membership. Based on these definitions, a fuzzy decision-theoretic rough set model is developed and its properties are discussed.
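The idea of replacing the crisp indiscernibility relation with a Gaussian-kernel fuzzy relation can be illustrated with a short sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the kernel form exp(-||x - y||^2 / (2 sigma^2)), the membership-weighted ratio used as the conditional probability, the threshold values `alpha` and `beta`, and all function names.

```python
import math


def gaussian_relation(x, y, sigma=0.5):
    """Fuzzy similarity of two samples; reflexive (R(x, x) = 1) and symmetric."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def fuzzy_conditional_probability(x, samples, labels, target, sigma=0.5):
    """Assumed definition: similarity mass of x inside the target class
    divided by its total similarity mass over all samples."""
    sims = [gaussian_relation(x, y, sigma) for y in samples]
    in_class = sum(s for s, lab in zip(sims, labels) if lab == target)
    return in_class / sum(sims)


def region(p, alpha=0.7, beta=0.3):
    """Three-way decision of a decision-theoretic rough set.
    The thresholds alpha > beta are illustrative values only."""
    if p >= alpha:
        return "positive"
    if p <= beta:
        return "negative"
    return "boundary"


if __name__ == "__main__":
    samples = [[0.0], [0.1], [3.0], [3.1]]
    labels = ["a", "a", "b", "b"]
    for x in ([0.0], [1.55], [3.1]):
        p = fuzzy_conditional_probability(x, samples, labels, "a")
        print(x, round(p, 3), region(p))
```

No discretization is needed in this formulation: the continuous attribute values enter the probability directly through the kernel, which is the property the abstract attributes to the fuzzy model.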
A definition of attribute reduction and a corresponding reduction algorithm for the fuzzy decision-theoretic rough set model are also studied. These results form the theoretical foundation for applying the fuzzy decision-theoretic rough set model to classification.

4. A multiple-classifier ensemble system is designed that selectively combines a set of individual classifiers trained on reducts computed with the fuzzy decision-theoretic rough set model. A two-step random search strategy for attribute reduction based on the fuzzy decision-theoretic rough set model is applied to the raw data to compute a set of reducts, and one individual classifier is trained on each reduct. A genetic algorithm is used to select a subset of the trained classifiers, and the final classification result of the ensemble is obtained by combining their outputs with a voting rule. The information in different reducts is distinct and complementary, so the individual classifiers trained on them are diverse. The experimental results show that the proposed ensemble system achieves good classification performance with fewer individual classifiers.
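The selective combination step can be sketched as a majority vote over the classifiers switched on by a 0/1 selection mask, which is the kind of chromosome a genetic algorithm could search over. This is a hypothetical illustration: the function names, the mask encoding, and the use of validation accuracy as the fitness are assumptions, and the genetic search itself is not shown.

```python
from collections import Counter


def majority_vote(predictions, mask):
    """Combine per-classifier predictions by plurality vote.

    predictions[i][j] is the label classifier i assigns to sample j;
    mask[i] == 1 means classifier i is part of the ensemble.
    Ties go to the label seen first among the selected classifiers.
    """
    selected = [p for p, keep in zip(predictions, mask) if keep]
    if not selected:
        raise ValueError("mask selects no classifier")
    combined = []
    for votes in zip(*selected):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, truth):
    """Fraction of samples labeled correctly; a candidate fitness for a mask."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


if __name__ == "__main__":
    predictions = [
        ["a", "b", "a"],  # classifier trained on reduct 1 (made-up outputs)
        ["a", "b", "b"],  # classifier trained on reduct 2
        ["b", "b", "a"],  # classifier trained on reduct 3
    ]
    truth = ["a", "b", "a"]
    ensemble = majority_vote(predictions, mask=[1, 1, 1])
    print(ensemble, accuracy(ensemble, truth))
```

A mask with fewer ones that keeps the same accuracy corresponds to the smaller ensemble the abstract reports.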
Keywords/Search Tags: rough set, fuzzy set, attribute reduction, continuous data, classification