Research On Data Mining Algorithm Based On Rough Set Theory

Posted on: 2006-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: W H Ceng
Full Text: PDF
GTID: 2168360155969651
Subject: Control theory and control engineering
Abstract/Summary:
We are entering an era of networked information. With the rapid development of computer and network technology, the volume of information in every field has grown enormously. How to extract potentially useful, valuable, and compact knowledge from vast and disordered data has become a pressing problem. Data mining (DM) and knowledge discovery in databases (KDD) have emerged to meet this need.

Rough set theory is one approach to DM and KDD. Its distinctive difference from other approaches to uncertainty, such as data mining based on probability, on fuzzy theory, or on evidence theory, is that it requires no prior knowledge beyond the data set being processed. The rough set method also serves as a strong complement to other methods for handling uncertainty, especially fuzzy theory.

However, the basic computations of rough set theory rely on set intersection, union, and complement, and on computing equivalence relations. Finding the simplest rule set or the complete rule set when rough set theory is applied to mine a general decision table is an NP-hard problem, and NP-hard problems are a general difficulty in computational mathematics. Because the solution depends on solving such an NP-hard problem, the time complexity of the algorithms has severely restricted the application of rough set theory.

The thesis introduces the basic concepts and extended theoretical models of rough set theory, and studies the properties of algorithms that use rough set theory to mine data.
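The set-based computations mentioned above can be illustrated with the standard lower and upper approximations of rough set theory. The following is a minimal sketch in plain Python, not the thesis's own implementation; the toy decision table, its attribute names, and the helper functions `equivalence_classes` and `approximations` are assumptions made for illustration only.

```python
from collections import defaultdict


def equivalence_classes(table, attrs):
    """Partition object ids by their values on the chosen attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())


def approximations(table, attrs, target):
    """Return the (lower, upper) approximations of `target` w.r.t. `attrs`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attrs):
        if block <= target:   # block lies entirely inside the target set
            lower |= block
        if block & target:    # block overlaps the target set
            upper |= block
    return lower, upper


# Hypothetical toy decision table: object id -> attribute values.
table = {
    1: {"headache": "yes", "temp": "high", "flu": "yes"},
    2: {"headache": "yes", "temp": "high", "flu": "no"},
    3: {"headache": "no", "temp": "high", "flu": "yes"},
    4: {"headache": "no", "temp": "normal", "flu": "no"},
}

# Target concept: objects whose decision attribute is flu = yes.
flu = {obj for obj, row in table.items() if row["flu"] == "yes"}
lower, upper = approximations(table, ["headache", "temp"], flu)
```

Objects 1 and 2 share the same condition values but differ in the decision, so they fall in the upper approximation only. Every step is an intersection, union, or subset test on sets, which is the cost the abstract refers to.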
By proving a representation theorem for the semigroup finite-set algebraic system, we show that a finite-set algebraic system can be described by an isomorphic bit-vector algebraic system, so that the set operations of intersection, union, and complement become the AND, OR, and NOT operations of the bit-vector system. The theorem is general: it applies to most rough set algorithms.

Using this theorem, an attribute reduction algorithm and a rule mining algorithm for data mining have been designed based on rough set theory. Compared with the general algorithm, the time complexity is greatly reduced, and the space complexity is reduced to one eighth of that of the general algorithm. A prototype system, RSDM, and a MATLAB toolkit that apply the algorithms have been developed for further research on and application of rough mining.

On the other hand, the time complexity problem in rough mining algorithms has not been completely solved. The thesis therefore also introduces the parallel computing model based on message passing. On the basis of this model, a parallel rough mining algorithm that uses the MPICH parallel computing package has been studied in principle.

The main work of the thesis is as follows:
1. A representation theorem for the semigroup finite-set algebraic system is stated and proved, and a serial implementation of the rough mining algorithm is designed.
2. The general knowledge discovery process is introduced and the rough mining process is studied. A prototype RSDM system and a MATLAB rough mining toolkit are designed and developed.
3. The parallel computing model based on message passing is introduced.
The parallel rough mining algorithm is also studied in principle.
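The correspondence between set operations and bit-vector operations described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the thesis's algorithm: subsets of a small, fixed universe are encoded as Python integers, and the helper names `encode` and `decode` are hypothetical.

```python
def encode(subset, universe):
    """Encode a subset as an int: bit i is set iff universe[i] is in the subset."""
    index = {x: i for i, x in enumerate(universe)}
    bits = 0
    for x in subset:
        bits |= 1 << index[x]
    return bits


def decode(bits, universe):
    """Recover the subset represented by a bit vector."""
    return {x for i, x in enumerate(universe) if (bits >> i) & 1}


universe = [1, 2, 3, 4, 5]
full = (1 << len(universe)) - 1   # bit vector of the whole universe

a = encode({1, 2, 3}, universe)
b = encode({3, 4}, universe)

intersection = a & b              # set intersection becomes bitwise AND
union = a | b                     # set union becomes bitwise OR
complement = full & ~a            # set complement becomes NOT, masked to the universe


def is_subset(x, y):
    """Subset test (x is contained in y) without iterating over elements."""
    return x & ~y == 0
```

Decoding shows that each bitwise result agrees with the corresponding set operation, for example `decode(intersection, universe)` gives `{3}`. Each operation works on whole machine words instead of individual elements, which is the kind of saving that such an encoding is meant to provide.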
Keywords/Search Tags: Rough Set, Data Mining, MPI, KDD