Research On Decision Tree Based On Rough Set Theory In Classification

Posted on: 2011-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Qin
GTID: 2178330338991244
Subject: Computer software and theory
Abstract/Summary:
Decision trees are widely used in classification. The classification rules represented by a decision tree are induced from a set of unordered, irregular instances, and they can reveal valuable, latent information. In this thesis, both the data sets and the decision tree algorithm are processed and improved in order to raise the prediction accuracy of classification and reduce time complexity.

Firstly, after comparing existing discretization algorithms for continuous variables, an algorithm for splitting sequence intervals, SISA, is presented. Intervals are split where instances share the same decision attribute value but differ in their condition attribute values. Candidate cut points are then inserted into the intervals, and the values within each interval are represented by a discretized value. A case analysis shows that the algorithm is easy to implement.

Secondly, to remove redundant attributes from the data set, an attribute reduction algorithm based on a simplified discernibility matrix, SDMAR, is proposed. The data set is first simplified before attribute reduction, which yields a simplified decision table. A simplified discernibility matrix is then constructed from this reduced decision table. Attribute reduction is achieved by finding the attributes that occur most frequently in the discernibility matrix. An analysis of the algorithm and a case study show that the time complexity of attribute reduction is low.

Lastly, ICEDT, an improved decision tree construction algorithm based on co-evolution, is proposed. ICEDT introduces binary coding, which makes crossover and mutation operations easy to implement, and a new method of computing fitness is given. The coded data set is divided into subsets according to its encoding features, and the co-evolution method is applied to each subset until a satisfactory decision tree is found.

The above algorithms are implemented in C. Experimental results show that the proposed algorithms take less time and achieve higher prediction accuracy than existing ones, and that the anticipated results are achieved.
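The idea of choosing the attributes that occur most frequently in a discernibility matrix can be illustrated with a short C sketch. This is not the SDMAR algorithm from the thesis, whose simplification steps are not given in the abstract; it is a generic greedy heuristic under stated assumptions. The names greedy_reduct, build_entries and MAX_PAIRS are hypothetical, the decision table is assumed to hold integer-coded values with the decision attribute in the last column, and each matrix entry is stored as a bitmask of the condition attributes on which two rows differ.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAIRS 256 /* hypothetical capacity; entries beyond it are ignored */

/* For every pair of rows with different decision values, store a bitmask of
 * the condition attributes on which the two rows differ. Each row holds
 * n_cond condition values followed by one decision value. */
static size_t build_entries(const int *rows, size_t n_rows, size_t n_cond,
                            unsigned *entries, size_t cap)
{
    size_t count = 0;
    for (size_t i = 0; i < n_rows; i++) {
        for (size_t j = i + 1; j < n_rows; j++) {
            const int *a = rows + i * (n_cond + 1);
            const int *b = rows + j * (n_cond + 1);
            if (a[n_cond] == b[n_cond])
                continue; /* same decision: nothing to discern */
            unsigned mask = 0;
            for (size_t k = 0; k < n_cond; k++)
                if (a[k] != b[k])
                    mask |= 1u << k;
            /* Skip inconsistent pairs (mask == 0) and respect the capacity. */
            if (mask != 0 && count < cap)
                entries[count++] = mask;
        }
    }
    return count;
}

/* Greedy selection: repeatedly pick the attribute that appears in the most
 * remaining entries, then drop every entry that contains it. Returns a
 * bitmask of the selected condition attributes. */
unsigned greedy_reduct(const int *rows, size_t n_rows, size_t n_cond)
{
    assert(n_cond <= 8 * sizeof(unsigned)); /* one bit per attribute */

    unsigned entries[MAX_PAIRS];
    size_t n = build_entries(rows, n_rows, n_cond, entries, MAX_PAIRS);
    unsigned reduct = 0;

    while (n > 0) {
        size_t best = 0, best_count = 0;
        for (size_t k = 0; k < n_cond; k++) {
            size_t c = 0;
            for (size_t e = 0; e < n; e++)
                if (entries[e] & (1u << k))
                    c++;
            if (c > best_count) {
                best_count = c;
                best = k;
            }
        }
        reduct |= 1u << best;

        /* Keep only the entries not yet covered by the chosen attribute. */
        size_t kept = 0;
        for (size_t e = 0; e < n; e++)
            if (!(entries[e] & (1u << best)))
                entries[kept++] = entries[e];
        n = kept;
    }
    return reduct;
}
```

For a table passed as a flat array (for example &table[0][0] for an int table[4][4] with three condition columns and one decision column), bit k of the returned mask is set when condition attribute k was selected. A greedy choice like this yields a set of attributes that distinguishes every stored pair, but it is not guaranteed to be a minimal reduct.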
Keywords/Search Tags:Decision tree, Discretization, Attribute reduction, Discernibility matrices, Co-evolution, Binary code