A Data Mining Method Based on Rough Sets

Posted on: 2007-11-15  Degree: Master  Type: Thesis
Country: China  Candidate: Y Wang  Full Text: PDF
GTID: 2208360185464705  Subject: Computer application technology
Abstract/Summary:
The volume of information in every field is growing explosively with the rapid development of computer and Internet technology. There is a correspondingly growing need for data analysis that can extract the useful patterns hidden in that data, and data mining technology has developed rapidly in response. Rough set theory, with its distinctive advantages, plays a critical role as an effective approach to data mining. Applying rough sets to data mining can improve the ability to analyze and learn from incomplete data in large databases, so the area is promising for both theoretical and applied research.

The thesis systematically surveys research on data mining and introduces the theoretical framework and basic concepts of rough sets, together with knowledge reduction, which is the core of rough set theory. It also discusses the relationships between knowledge reduction and dependence and between knowledge representation systems and decision tables, and it analyzes the meaning of the discernibility matrix and its relationship to reduction. Several existing reduction algorithms are studied and the advantages and disadvantages of each are analyzed. The following algorithms are then proposed.

Firstly, the concept of an association matrix is proposed to improve efficiency. Using attribute frequency information, a new algorithm based on the dependence of the decision attribute on the condition attributes is proposed. Experimental results show that the algorithm is effective and, in particular, that it saves more time on large data sets.

Secondly, an attribute reduction method based on a genetic algorithm is proposed. The core is added to the initial population of the genetic algorithm to improve performance. The algorithm retains the overall optimization behavior of genetic algorithms while taking into account the dependence of the decision attribute on the condition attributes, and a strategy of preserving the best individual is adopted to ensure convergence. Experimental results show that the approach is effective.

Thirdly, an incremental learning method based on rough set theory and decision tree techniques is proposed. The method avoids repeated computation by incrementally updating the original rule set according to how each new instance relates to the existing decision trees, so learning efficiency increases greatly. Experimental results show that the algorithm is effective and achieves a higher recognition rate.
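Two notions recur throughout the abstract: the dependence of the decision attribute on the condition attributes, and the core. As a minimal illustration, the sketch below computes both from their standard rough set definitions on a small decision table. It is not any of the algorithms proposed in the thesis; the table `TABLE`, the attribute names, and the function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical toy decision table: condition attributes a, b, c; decision attribute d.
TABLE = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
CONDITIONS = ["a", "b", "c"]
DECISION = "d"


def partition(rows, attrs):
    """Group row indices into equivalence classes (rows identical on attrs)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond, dec):
    """Degree of dependence of dec on cond: |POS_cond(dec)| / |U|."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond):
        # A block lies in the positive region if all its rows share one decision value.
        if len({rows[i][dec] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def core(rows, cond, dec):
    """Condition attributes whose removal lowers the dependency degree."""
    full = dependency(rows, cond, dec)
    return [
        a for a in cond
        if dependency(rows, [c for c in cond if c != a], dec) < full
    ]
```

On this toy table, calling `dependency(TABLE, CONDITIONS, DECISION)` and `core(TABLE, CONDITIONS, DECISION)` gives the full dependency degree and the attributes that no reduct can omit. A reduction algorithm of the kind the abstract describes would typically start from such a core and add attributes until the dependency degree of the full condition set is restored.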
Keywords/Search Tags: data mining, rough sets, association matrix, genetic algorithm, incremental algorithm