The Research On The Classification Model Based On Rough Set And Entropy

Posted on: 2005-03-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Yang
Full Text: PDF
GTID: 2168360122992304
Subject: Computer software and theory
Abstract/Summary:
Knowledge discovery in databases (KDD) is a flourishing research field that draws on statistics, artificial intelligence, and database systems. Data mining is the process of extracting interesting, potentially useful, valid, and understandable knowledge from data. Classification is an important branch of data mining: it builds a model that describes a predetermined set of data classes or concepts, and that model is then used to predict the class label of a test sample.

Rough Set theory, proposed by the Polish mathematician Pawlak, is used to represent uncertain knowledge. It has become a main method for KDD because of its particular advantages in knowledge discovery. Entropy is a concept from information theory that is widely used in data analysis.

In this thesis, an RSE algorithm model based on Rough Set theory and entropy theory is presented. It contains two components: a classification model and a prediction model. The classification model is based on classical Rough Set theory and entropy theory. It selects attributes according to entropy, determines the equivalence classes according to the indiscernibility relation, and then extracts the classification rules. The prediction model is based on an extended rough set model, tolerance rough set theory. It predicts the class label of a test sample according to the definition of the tolerance relation between a sample and a rule.

In addition, we designed a prototype system named R-DM, which implements the classification and prediction models of both the RSE algorithm and the ID3 algorithm. On this uniform platform, we compared the RSE algorithm and the ID3 algorithm on standard UCI data sets. The experimental results indicate that the RSE algorithm is superior to the ID3 algorithm.
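The pipeline described in the abstract (entropy-driven attribute selection, equivalence classes from the indiscernibility relation, rule extraction, and tolerance-based prediction) can be illustrated with a minimal Python sketch. This is not the thesis's RSE algorithm, whose details are not given here. All function names are hypothetical, the greedy selection strategy is an assumption, and the tolerance relation is assumed to be "the fraction of rule conditions matched by the sample is at least a threshold".

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def equivalence_classes(rows, attrs):
    """Partition row indices by the indiscernibility relation on attrs:
    two rows fall in the same class when they agree on every attribute."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return dict(blocks)


def conditional_entropy(rows, attrs, target):
    """Entropy of the target label given the partition induced by attrs."""
    total = len(rows)
    return sum(
        len(idx) / total * entropy([rows[i][target] for i in idx])
        for idx in equivalence_classes(rows, attrs).values()
    )


def select_attributes(rows, candidates, target):
    """Assumed greedy strategy: repeatedly add the attribute that lowers
    the conditional entropy the most, stopping when nothing improves."""
    chosen, remaining = [], list(candidates)
    best = entropy([r[target] for r in rows])
    while remaining and best > 0:
        scores = {a: conditional_entropy(rows, chosen + [a], target) for a in remaining}
        attr = min(remaining, key=lambda a: scores[a])
        if scores[attr] >= best:
            break
        chosen.append(attr)
        remaining.remove(attr)
        best = scores[attr]
    return chosen


def extract_rules(rows, attrs, target):
    """One rule per equivalence class: conditions -> majority class label."""
    rules = []
    for key, idx in equivalence_classes(rows, attrs).items():
        label = Counter(rows[i][target] for i in idx).most_common(1)[0][0]
        rules.append((dict(zip(attrs, key)), label))
    return rules


def predict(sample, rules, threshold=0.5):
    """Assumed tolerance relation: a sample tolerates a rule when the
    fraction of matched conditions is >= threshold. The best-matching
    tolerated rule decides the label; None means no rule is tolerated."""
    best_label, best_sim = None, -1.0
    for conditions, label in rules:
        if conditions:
            matched = sum(sample.get(a) == v for a, v in conditions.items())
            sim = matched / len(conditions)
        else:
            sim = 1.0  # a rule without conditions matches every sample
        if sim >= threshold and sim > best_sim:
            best_label, best_sim = label, sim
    return best_label


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    rows = [
        {"outlook": "sunny", "windy": "no", "play": "yes"},
        {"outlook": "sunny", "windy": "yes", "play": "yes"},
        {"outlook": "rain", "windy": "no", "play": "no"},
        {"outlook": "rain", "windy": "yes", "play": "no"},
    ]
    attrs = select_attributes(rows, ["outlook", "windy"], "play")
    rules = extract_rules(rows, attrs, "play")
    print(attrs, rules, predict({"outlook": "rain", "windy": "no"}, rules))
```

Unlike exact rule matching, the threshold lets a sample be classified even when it agrees with a rule on only some conditions, which is the general motivation for replacing the equivalence relation with a tolerance relation at prediction time.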
Keywords/Search Tags:Data Mining, Rough Set, Tolerance Rough Set, Entropy, Classification, Prediction