
The Research On Combination Of Multi-classifiers Based On Rough Set Theory

Posted on: 2013-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Yuan
GTID: 2248330377951063
Subject: Computer software and theory
Abstract/Summary:
With the development of computer and communication technology, large volumes of data are being produced, and data mining has been proposed to discover the implicit rules in such data. The main data mining techniques currently include decision trees, neural networks, regression analysis, genetic algorithms, rough sets, and clustering. Among them, rough set theory, which is based on the classification ability of data, can address knowledge acquisition from latent, uncertain, or ambiguous data without prior knowledge. It has been applied in pattern recognition, feature selection, and fault diagnosis.

A traditional classification method relies on a single classifier, which has difficulty classifying all data samples correctly. The integrated theory of multi-classifiers is an effective way to overcome this limitation. Although it is an important topic in machine learning, the integration of multiple classifiers has not yet attracted wide attention in applications of rough sets.

In this paper, rough set theory is brought into the integrated theory of multi-classifiers, and a method of integrated learning based on rough set theory is discussed. The main contents are as follows:

1. Researching the construction of base classifiers. Using different algorithms (rough set theory, the C4.5 algorithm, and the naive Bayes (NB) algorithm), this paper builds different types of base classifier, namely rough set classifiers, Bayesian classifiers, and decision tree classifiers, each from a randomly drawn training data set.

2. Researching the differences among base classifiers. According to the integrated theory of multi-classifiers, the more the base classifiers differ from one another, the better the classification effect of the combination. This paper obtains such differences in two ways: first, the base classifiers are generated from random samples of the training data set; second, they are generated by three different types of algorithm.

3. Proposing an integration strategy and method for multi-classifiers. To get the best integrated effect, this paper generates many classifiers of different types from the training data set. The training data set is divided into multiple subsets according to the decision attribute values, and the test data set is divided into multiple cluster subsets by the K-means method. The correspondence between training data subsets and cluster subsets is then found through Euclidean distance. Finally, the classifier that performs best on a training data subset is selected to classify the corresponding cluster subset.

To demonstrate its effectiveness, the method is applied to a large number of UCI data sets, on which it obtains better classification results.
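The integration strategy in item 3 can be sketched in a few lines of Python. This is only one possible reading of the abstract, not the thesis's implementation. The base learners here (`train_nearest_centroid`, `train_1nn`, `train_majority`) are hypothetical stand-ins for the rough set, C4.5, and naive Bayes classifiers. The 2/3 sampling ratio, the use of one cluster per class, and matching by the distance between centroids are all assumptions introduced for illustration.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centers, cluster index of each point)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    def nearest(p):
        return min(range(k), key=lambda j: dist(p, centers[j]))
    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = centroid(members)
    return centers, [nearest(p) for p in points]

# Hypothetical stand-in base learners (NOT rough set, C4.5, or naive Bayes).
# Each takes (X, y) and returns a predict(x) function.
def train_nearest_centroid(X, y):
    cents = {c: centroid([x for x, l in zip(X, y) if l == c]) for c in set(y)}
    return lambda x: min(cents, key=lambda c: dist(x, cents[c]))

def train_1nn(X, y):
    pairs = list(zip(X, y))
    return lambda x: min(pairs, key=lambda t: dist(x, t[0]))[1]

def train_majority(X, y):
    top = max(set(y), key=y.count)
    return lambda x: top

def accuracy(clf, X, y):
    """Fraction of samples in X that clf labels as given in y."""
    return sum(clf(x) == l for x, l in zip(X, y)) / len(X)

def select_and_classify(train_X, train_y, test_X, trainers, seed=0):
    rng = random.Random(seed)
    n = len(train_X)
    # 1. Train each base classifier on a random sample of the training data
    #    (the 2/3 sampling ratio is an assumption).
    pool = []
    for trainer in trainers:
        idx = rng.sample(range(n), max(1, 2 * n // 3))
        pool.append(trainer([train_X[i] for i in idx], [train_y[i] for i in idx]))
    # 2. Split the training set by decision attribute value (class label).
    labels = sorted(set(train_y))
    subsets = {c: [x for x, l in zip(train_X, train_y) if l == c] for c in labels}
    sub_centers = {c: centroid(pts) for c, pts in subsets.items()}
    # 3. Cluster the test set; one cluster per class is an assumption.
    centers, assign = kmeans(test_X, len(labels), seed=seed)
    # 4. Match each cluster to the nearest training subset, then pick the
    #    classifier that scores best on that subset.
    chosen = []
    for center in centers:
        c = min(labels, key=lambda l: dist(center, sub_centers[l]))
        truth = [c] * len(subsets[c])
        chosen.append(max(pool, key=lambda clf: accuracy(clf, subsets[c], truth)))
    # 5. Classify every test sample with the classifier chosen for its cluster.
    return [chosen[a](x) for x, a in zip(test_X, assign)]
```

Calling `select_and_classify(train_X, train_y, test_X, [train_nearest_centroid, train_1nn, train_majority])` returns one predicted label per test sample. Because each training subset here holds a single decision value, scoring by plain accuracy on that subset favors classifiers biased toward that value. The thesis may well use a different selection criterion, so the scoring step should be treated as a placeholder.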
Keywords/Search Tags: rough sets, integrated learning, classify, combination of multi-classifiers