Research And Application On Data Mining Algorithm Based On Rough Set

Posted on: 2012-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: G R Zhang
Full Text: PDF
GTID: 2218330341450519
Subject: Computer application technology
Abstract/Summary:
As a key step in knowledge discovery, data mining extracts valuable information from massive data sets, and it is an active field in current artificial intelligence and information science research. Rough-set-based data mining is the process of extracting novel and useful knowledge from data using rough set theory and its methods. The decision tree is a commonly used data mining method: it has a simple structure, runs fast, and is easy to understand. In practice, however, existing decision tree algorithms have many disadvantages. This paper analyzes and studies how to optimize the decision tree algorithm by combining rough sets with decision trees. The main research work is as follows.

In the rough set part, the discernibility matrix reduction algorithm is discussed and a method for simplifying it is put forward. Compared with a discernibility matrix built on single elements, a discernibility matrix built on equivalence classes makes the reduction of a decision table easier to obtain. The more knowledge an attribute carries, the stronger its distinguishing force. A decision table reduction method based on attribute distinguishing force is put forward, which combines the discernibility matrix with attribute distinguishing force. Compared with methods such as the positive-region method, it can find an attribute reduction more easily.

In the decision tree part, an optimized decision tree algorithm that is combined with rough sets and based on the distinguish-between value is presented. The method divides the attributes into two parts according to their distinguish-between values, so that candidate attributes with a higher distinguish-between value can be obtained somewhat more quickly. The decision tree algorithm is optimized from two aspects: knowledge reduction and pruning.
A post-pruning optimization algorithm for decision trees based on rough sets is put forward. The method uses the reduction method based on attribute distinguishing force to identify the important nodes. Because it does not need the error rates of the unimportant nodes, it simplifies the decision tree algorithm and thus raises the efficiency of tree construction.

Finally, the paper compares and assesses the optimized decision tree algorithm based on the distinguish-between value and the rough-set-based post-pruning optimization algorithm against related algorithms. The experiments show that both are feasible and effective.
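For readers unfamiliar with discernibility-matrix reduction, the general idea can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. It builds the standard discernibility entries of a small decision table, then greedily picks the attribute that appears in the most remaining entries. That count is only an assumed stand-in for the "attribute distinguish force", whose exact definition the abstract does not give. The function names, the sample table, and the greedy strategy are all hypothetical, and the result covers every entry but is not guaranteed to be a minimal reduct.

```python
from itertools import combinations


def discernibility_entries(rows, cond_attrs, decision):
    """Collect the non-empty discernibility-matrix entries.

    For every pair of objects with different decision values, the entry is
    the set of condition attributes on which the two objects differ.
    """
    entries = []
    for x, y in combinations(rows, 2):
        if x[decision] == y[decision]:
            continue  # only pairs with different decisions must be discerned
        diff = frozenset(a for a in cond_attrs if x[a] != y[a])
        if diff:  # an empty set means the table is inconsistent for this pair
            entries.append(diff)
    return entries


def greedy_reduct(rows, cond_attrs, decision):
    """Greedy attribute selection (an assumption, not the thesis's method).

    Repeatedly choose the attribute occurring in the most uncovered entries,
    until every entry contains at least one chosen attribute.
    """
    remaining = discernibility_entries(rows, cond_attrs, decision)
    chosen = []
    while remaining:
        best = max(cond_attrs, key=lambda a: sum(a in e for e in remaining))
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    return chosen


# Hypothetical toy decision table: the decision "play" depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": "no", "humid": "high", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "humid": "high", "play": "no"},
    {"outlook": "rain", "windy": "no", "humid": "high", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "humid": "normal", "play": "yes"},
]
cond_attrs = ["outlook", "windy", "humid"]

print(greedy_reduct(rows, cond_attrs, "play"))  # ['outlook']
```

In this toy table, "outlook" appears in all four discernibility entries, so one attribute is enough to distinguish every pair of objects with different decisions.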
Keywords/Search Tags:data mining, rough set, knowledge reduction, decision tree