
Research On Mining Classification Rules From Multiple Data Sources Based On Concept Lattice

Posted on: 2007-10-15
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
GTID: 2178360182986601
Subject: Computer software and theory
Abstract/Summary:
Parallel and distributed data mining is becoming a new research focus in response to the growth of data and its new characteristics, such as high dimensionality, heterogeneity, and distributed storage. Mining classification rules is one of the important branches of data mining. This dissertation focuses on mining classification rules from multiple data sources. The concept lattice is used as the model for describing classification because it is an efficient tool for knowledge discovery, with a complete structure and a mature theory.

The major work of the dissertation is as follows:

1. The method of mining classification rules from multiple data sources based on the concept lattice is studied. Taking into account the intelligibility of the results and the concrete implementation scheme, two ways of expressing the obtained knowledge, one based on models and one based on rules, are analyzed, and the rule form, which expresses classification intuitively, is chosen. A knowledge-combination approach is adopted: the methods and related algorithms for direct amalgamation and synchronous amalgamation of classification rules are introduced, analyzed, and compared. Synchronous amalgamation is then chosen to implement the mining of classification rules from multiple data sources.

2. The phenomenon and causes of overfitting that arise when mining classification rules from large-scale data are analyzed. On the basis of this analysis, a pruning strategy is adopted. Two methods, pre-pruning and post-pruning, are analyzed, and post-pruning is chosen to prune the concept lattice. The effect of pruning is compared and analyzed experimentally, and the results indicate that the method is valid.

3. Based on the work above, an experimental system named DMM_CLASS is presented, which implements the mining of classification rules from multiple data sources based on the concept lattice.
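
For readers unfamiliar with the underlying model, the following is a minimal sketch of how classification rules can be read off a concept lattice. It is not the dissertation's algorithm, which the abstract does not specify. The toy context, the class labels, and the function names (`extent`, `intent`, `concepts`, `classification_rules`) are all assumptions made for illustration, and the brute-force enumeration is only practical for very small attribute sets.

```python
from itertools import combinations

# Hypothetical toy context: object -> (attribute set, class label).
CONTEXT = {
    "o1": ({"a", "b"}, "yes"),
    "o2": ({"a", "c"}, "yes"),
    "o3": ({"b", "c"}, "no"),
    "o4": ({"c"}, "no"),
}

ALL_ATTRS = frozenset().union(*(has for has, _ in CONTEXT.values()))


def extent(attrs):
    """Objects that possess every attribute in attrs."""
    return frozenset(o for o, (has, _) in CONTEXT.items() if set(attrs) <= has)


def intent(objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set(ALL_ATTRS)
    for o in objs:
        shared &= CONTEXT[o][0]
    return frozenset(shared)


def concepts():
    """All formal concepts (extent, intent), found by closing every attribute subset."""
    found = set()
    attrs = sorted(ALL_ATTRS)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            ext = extent(subset)
            found.add((ext, intent(ext)))
    return found


def classification_rules():
    """Rules (intent -> label) from concepts whose objects all share one label."""
    rules = set()
    for ext, itt in concepts():
        labels = {CONTEXT[o][1] for o in ext}
        # Skip the empty extent and the empty intent (no usable condition).
        if ext and itt and len(labels) == 1:
            rules.add((itt, labels.pop()))
    return rules


if __name__ == "__main__":
    for condition, label in sorted(classification_rules(), key=lambda r: sorted(r[0])):
        print(sorted(condition), "->", label)
```

In this sketch some rules are redundant: a rule whose condition is a superset of another rule's condition with the same label adds nothing. Removing such rules, or rules supported by too few objects, is one simple way to think about the kind of pruning the abstract refers to, though the dissertation's actual pruning criteria are not given here.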
Keywords/Search Tags:Classification Rule, Distributed Data Mining, Concept Lattice, Overfitting, Pruning