
Design And Optimization Of Classification System Based On Genetic Fuzzy Theory

Posted on: 2010-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
GTID: 2178330338475911
Subject: Computer application technology
Abstract/Summary:
Databases contain massive amounts of information that can support business decisions and scientific judgment. Constructing an accurate and efficient classification system from a large-scale database has been a key task in data mining and machine learning research. Fuzzy logic is a useful tool for data mining: because it can handle imprecise knowledge and imprecise reasoning, applying fuzzy logic to data mining in order to build classification systems has recently become a research hotspot. In addition, the lack of learning capability in fuzzy reasoning has generated interest in fuzzy systems augmented with learning capabilities. Integrating evolutionary algorithms with fuzzy systems within the framework of soft computing yields the genetic fuzzy system (GFS), which has been introduced and widely used. GFSs have demonstrated self-learning, adaptive, and optimizing abilities. This thesis introduces a novel method for applying GFS to data mining. The main research is described as follows.

First, to solve the problem of fuzzy partitioning of the original database, a fuzzy clustering algorithm based on competitive agglomeration (CA) is introduced. Traditional fuzzy c-means (FCM) cannot predict the best partition of a given database; the CA algorithm effectively overcomes this limitation and partitions the quantitative attributes of each data record into several optimized fuzzy sets according to their differing structures and attributes. After each quantitative attribute has been partitioned into several optimized fuzzy sets by the CA algorithm, these fuzzy sets are usually represented as fuzzy variables for classification. In the experiments, FCM and CA are applied separately to the given database.
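To make the idea of partitioning a quantitative attribute into fuzzy sets concrete, the following is a minimal sketch of the standard FCM membership formula for fixed cluster centers. It is not the CA algorithm of the thesis; the function name, the centers, and the fuzzifier value m = 2 are assumptions chosen for illustration.

```python
def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership degrees of value x in each cluster.

    centers: hypothetical cluster centers for one quantitative attribute.
    m: fuzzifier (> 1); m = 2 is a common choice and is assumed here.
    """
    dists = [abs(x - c) for c in centers]
    # A value lying exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    # Standard FCM rule: membership is proportional to d^(-2/(m-1)),
    # normalized so that the degrees sum to 1.
    exponent = 2.0 / (m - 1.0)
    inv = [d ** -exponent for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


# Hypothetical centers for an attribute split into "low", "medium", "high".
centers = [20.0, 40.0, 60.0]
degrees = fcm_memberships(30.0, centers)
```

Each returned degree can be read as the membership of the value in one fuzzy set of the attribute, which is the form in which the partitioned attributes enter the later rule-mining step.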
The experimental results show that CA reflects the differing distribution of the data much better and that its partition results are much more reasonable.

Second, to obtain fuzzy association rules from the new database constructed by fuzzy partitioning, an improved algorithm based on the Apriori algorithm is proposed according to the relevant features and definitions. The fuzzy association rules it mines can be used to construct the rule base (RB) of the fuzzy classification system (FCS). The experimental results show that the modified algorithm can mine interesting fuzzy association rules that satisfy at least a minimum support and a minimum confidence, which verifies the effectiveness of the algorithm.

Third, the two main factors related to an FCS are accuracy and interpretability. Mining some redundant rules is unavoidable, and these rules significantly influence the performance of the FCS. Therefore, relevant GFS techniques, including genetic tuning of the data base (DB) and genetic learning of the RB, are used to optimize the FCS and achieve a trade-off between accuracy and interpretability. Experimental results show that the proposed approach not only simplifies the RB but also significantly improves classification accuracy.

Finally, simulation results on an existing diabetes dataset demonstrate that the proposed approaches perform better than other popular classification methods.
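The minimum support and minimum confidence thresholds mentioned in the second point can be illustrated with a small sketch. The abstract does not give the thesis's exact definitions, so this assumes one common formulation: the support of a fuzzy itemset is the average, over all records, of the minimum membership degree among its items, and confidence is the ratio of two supports. The function names, item labels, and membership values below are hypothetical.

```python
def fuzzy_support(records, items):
    """Average over records of the minimum membership among `items`.

    records: list of dicts mapping a fuzzy item label to a membership degree.
    Using min as the conjunction is an assumption; a product is also common.
    """
    if not records:
        return 0.0
    total = sum(min(r.get(i, 0.0) for i in items) for r in records)
    return total / len(records)


def fuzzy_confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, divided by antecedent support."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    return fuzzy_support(records, list(antecedent) + list(consequent)) / base


# Hypothetical fuzzified records (labels and values are made up).
records = [
    {"glucose=high": 0.8, "class=positive": 1.0},
    {"glucose=high": 0.6, "class=positive": 1.0},
    {"glucose=high": 0.2, "class=positive": 0.0},
    {"glucose=high": 0.4, "class=positive": 0.0},
]
support = fuzzy_support(records, ["glucose=high", "class=positive"])
confidence = fuzzy_confidence(records, ["glucose=high"], ["class=positive"])
```

Under these assumptions, a rule such as "glucose=high implies class=positive" would be kept only if both values reach the chosen minimum thresholds, which is how an Apriori-style search prunes candidate rules.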
Keywords/Search Tags: Genetic fuzzy, Fuzzy classification system, Fuzzy c-means algorithm, Competitive agglomeration algorithm, Fuzzy association rules, Apriori algorithm