Data Reduction Algorithm Based On Information Entropy

Posted on: 2009-06-23 | Degree: Master | Type: Thesis
Country: China | Candidate: S An | Full Text: PDF
GTID: 2178360308478688 | Subject: Operational Research and Cybernetics
Abstract/Summary:
With the rapid development of database technology and the wide application of database management systems, people have accumulated more and more databases, and a great deal of important information lies hidden in these large volumes of data. Data mining is the process of extracting information that is implicit, previously unknown, latent, and useful.

Rough set theory is a relatively new mathematical tool for dealing with vague and uncertain knowledge. Its distinguishing trait is that it needs no prior knowledge or additional information beyond the data itself. Rough sets can handle imprecise, incomplete, and inconsistent information, from which they can discover implicit knowledge and reveal latent rules, which makes them a useful tool for data mining. Attribute reduction is the core of rough set theory.

This thesis takes entropy as heuristic information and obtains two new attribute reduction algorithms. The first lists the results for all objects whose decision can be reached directly, and then analyses the remaining objects, so it avoids needless repetition. Note that an attribute need not be repeated along the same branch, which serves as an effective pruning rule. The second starts its search from the empty set and uses backtracking, which reduces the search space and improves efficiency. Both algorithms reduce the number of search steps and the space required. Their efficiency is demonstrated with worked examples.

Classical rough set theory deals with discrete values, and entropy is likewise applied to information systems with discrete values. Real data, however, contain real-valued attributes as well as discrete ones. Finally, the thesis introduces the basic theory of fuzzy entropy and of evaluation models, and puts forward an algorithm based on fuzzy entropy for use in an evaluation model. The validity of this algorithm is demonstrated with an example.
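
To make the idea of "entropy as heuristic information" concrete, the following is a minimal Python sketch of one common way to use conditional entropy for attribute reduction. It is not the thesis's algorithm: the abstract does not give the procedures in detail, so this sketch substitutes a plain greedy forward selection starting from the empty set, followed by a redundancy check, instead of the backtracking search described above. The function names, the decision table `rows`, and the attribute names `a`, `b`, `c`, `d` are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, attrs, decision):
    """Shannon conditional entropy H(decision | attrs) of a decision table.

    rows is a list of dicts; attrs is a list of condition attribute names.
    With attrs empty, all rows fall in one block and this is H(decision).
    """
    n = len(rows)
    # Group rows into equivalence classes by their values on attrs and
    # count the decision values inside each class.
    blocks = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        blocks[key][row[decision]] += 1
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            h -= (size / n) * (c / size) * log2(c / size)
    return h


def greedy_reduct(rows, cond_attrs, decision, eps=1e-12):
    """Assumed illustration: greedy entropy-guided attribute selection.

    Starts from the empty set, repeatedly adds the attribute that lowers
    H(decision | chosen) the most, and stops once the entropy matches that
    of the full attribute set. A final pass drops redundant attributes.
    """
    target = conditional_entropy(rows, cond_attrs, decision)
    chosen = []
    remaining = list(cond_attrs)
    while remaining and conditional_entropy(rows, chosen, decision) > target + eps:
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, chosen + [a], decision),
        )
        chosen.append(best)
        remaining.remove(best)
    # Remove any attribute whose absence leaves the entropy unchanged.
    for a in list(chosen):
        trial = [x for x in chosen if x != a]
        if conditional_entropy(rows, trial, decision) <= target + eps:
            chosen = trial
    return chosen


# Hypothetical decision table: the decision d depends only on attribute a.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
reduct = greedy_reduct(rows, ["a", "b", "c"], "d")
print(reduct)
```

On this toy table the selected set is the single attribute `a`, because conditioning on `a` alone already brings the entropy of `d` down to that of the full attribute set. A greedy search of this kind returns a reduced attribute set but is not guaranteed to find a minimal one in general, which is one reason to consider a backtracking search with pruning.
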
Keywords/Search Tags: rough set, data mining, discernibility matrix, attribute reduction, information entropy, fuzzy entropy