Research On Rough Set Theory Based Data Mining Algorithm

Posted on: 2007-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 1118360212467704
Subject: Computer software and theory
Abstract/Summary:
The theoretical work in this thesis was supported by project 2002CB312000 of the National 973 Fundamental Research Program and project 60473077 of the National Natural Science Foundation. In recent years, in contrast to the rapid development of the country's information infrastructure, the technology for automatic knowledge acquisition has become a bottleneck. Data mining, which aims to extract valuable information or knowledge automatically and intelligently from huge amounts of data, is an active research field in AI. Rough set theory, an efficient mathematical tool for dealing with vagueness and uncertainty, provides a new approach to data mining. This thesis studies several data mining problems based on rough set theory, mainly feature selection and the discretization of continuous features. The main contributions are:

1) We propose a heuristic feature reduct algorithm based on feature appearance frequency. To find a minimal reduct, the algorithm uses the frequency with which a condition feature appears in the discernibility matrix as its primary heuristic information, and the length of the shortest discernibility matrix entry containing that feature as secondary heuristic information. Experiments show that in most situations the proposed algorithm finds an optimal reduct.

2) We analyze the relationship between the condition features in a reduct and the decision feature, and the relationship between a condition feature and the other features in the reduct. Building on the concept of feature dependency in rough set theory, we define feature correlation and, based on this definition, propose a feature reduct algorithm. The empirical study shows that the proposed approach is efficient and effective in removing redundant and irrelevant features.

3) We propose a new approach to decide candidate cut point set of decision...
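The frequency heuristic in item 1) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, whose details the abstract does not give. It is a plain greedy cover of the discernibility matrix under stated assumptions: the decision table is a list of tuples of discrete condition values plus a list of decision values, the secondary heuristic is used only to break frequency ties, and the names `discernibility_entries` and `greedy_reduct` are hypothetical. The result intersects every matrix entry, but a greedy choice is not guaranteed to be minimal.

```python
from collections import Counter
from itertools import combinations


def discernibility_entries(rows, decisions):
    """Return the non-empty discernibility matrix entries.

    Each entry is the set of condition-feature indices on which two
    objects with different decision values differ.
    """
    entries = []
    for (x, dx), (y, dy) in combinations(zip(rows, decisions), 2):
        if dx != dy:
            diff = frozenset(i for i, (a, b) in enumerate(zip(x, y)) if a != b)
            if diff:  # inconsistent pairs yield an empty set and are skipped
                entries.append(diff)
    return entries


def greedy_reduct(rows, decisions):
    """Greedily pick features until every matrix entry is covered.

    Primary heuristic: how often a feature appears in the remaining entries.
    Tie-break (assumed): the length of the shortest remaining entry that
    contains the feature, then the feature index for determinism.
    """
    entries = discernibility_entries(rows, decisions)
    reduct = []
    while entries:
        freq = Counter(f for e in entries for f in e)
        shortest = {}
        for e in entries:
            for f in e:
                shortest[f] = min(shortest.get(f, len(e)), len(e))
        best = min(freq, key=lambda f: (-freq[f], shortest[f], f))
        reduct.append(best)
        # Entries containing the chosen feature are now discerned.
        entries = [e for e in entries if best not in e]
    return sorted(reduct)


# Toy decision table: the decision simply copies condition feature 1.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
decisions = [0, 1, 0, 1]
print(greedy_reduct(rows, decisions))
```

On this toy table feature 1 appears in every matrix entry, so it is selected first and covers all of them on its own.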
Keywords/Search Tags: Data Mining, Rough Set Theory, Feature Selection, Reduct, Feature Correlation Index, Feature Dependency, Continuous Feature Discretization