
Study Of Rough Set Theory Based Incremental Algorithms And Its Application

Posted on: 2004-06-08 | Degree: Master | Type: Thesis
Country: China | Candidate: Y H Chen | Full Text: PDF
GTID: 2168360092475056 | Subject: Computer software and theory
Abstract/Summary:
With the development and application of database technology, large quantities of data have been produced, and the volume grows daily in every sector of society. Data Mining is an effective tool for uncovering potentially useful knowledge hidden in such data. However, because the data are both large and dynamic, it is impractical to extract knowledge from scratch every time new data is inserted: doing so wastes hardware and software resources. A study of incremental Data Mining algorithms is therefore urgently needed. When the database is updated, incremental Data Mining modifies only the existing rule set, reusing previous computation results and avoiding knowledge extraction from the very beginning.

Rough Set Theory is a Data Mining approach capable of dealing with incomplete and inaccurate data, and it has been successfully applied to Artificial Intelligence and Knowledge Discovery, Pattern Recognition and Classification, and Fault Detection. However, most current algorithms based on Rough Set Theory take static datasets as their input and ignore the dynamic nature of the datasets in today's information systems. Hence, we consider it necessary to study Rough Set Theory based incremental algorithms.

First, we introduce two typical Rough Set Theory based incremental algorithms, analyze their advantages and disadvantages, and point out their theoretical faults. Then, after introducing and analyzing algorithm ASRAI, we give a counterexample to it and present a revised algorithm, CHEN1, that eliminates this disadvantage of ASRAI. Empirical results show that our revision works well. Finally, we introduce algorithm Shan; our further analysis proves that algorithm Shan is equivalent to algorithm ASRAI.

Through the study of algorithm ASRAI we find that when a certain kind of new record is inserted, ASRAI produces false results. Because these errors cannot be eliminated, the rules must then be recalculated from scratch. The above analysis indicates that algorithm CHEN1 can properly handle only a portion of the counterexamples to ASRAI. Therefore, we present another incremental algorithm, CHEN2. Empirical results show that algorithm CHEN2 works quite well. Additionally, we implement all the algorithms presented in this paper. The software system we developed also contains several Rough Set Theory algorithms and the empirical results for algorithm CHEN2.
Keywords/Search Tags: Rough Set, Incremental Learning, Incremental Algorithm, Discernibility Matrix, Data Mining
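
As a rough illustration of the incremental idea described in the abstract, the sketch below maintains a discernibility matrix for a small decision table and extends it when one record is inserted, instead of rebuilding it. It is not ASRAI, Shan, CHEN1 or CHEN2, whose details the abstract does not give. The record layout (a tuple of condition values plus a decision label) and the function names `discern`, `build_matrix` and `insert_record` are assumptions made for this example only.

```python
from itertools import combinations


def discern(a, b):
    """Return the indices of condition attributes on which two records differ.

    Each record is (condition_values, decision). Pairs with the same decision
    need not be discerned, so None is returned for them.
    """
    (cond_a, dec_a), (cond_b, dec_b) = a, b
    if dec_a == dec_b:
        return None
    return frozenset(i for i, (x, y) in enumerate(zip(cond_a, cond_b)) if x != y)


def build_matrix(records):
    """Batch construction: compare every pair of records (quadratic work)."""
    matrix = {}
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        entry = discern(a, b)
        if entry is not None:
            matrix[(i, j)] = entry
    return matrix


def insert_record(records, matrix, new_record):
    """Incremental update: compare the new record with existing ones only."""
    j = len(records)
    for i, old in enumerate(records):
        entry = discern(old, new_record)
        if entry is not None:
            matrix[(i, j)] = entry
    records.append(new_record)
    return matrix


# Hypothetical toy decision table: two condition attributes, one decision.
records = [((1, 0), "yes"), ((1, 1), "no"), ((0, 1), "no")]
matrix = build_matrix(records)
insert_record(records, matrix, ((0, 0), "yes"))
```

After the insertion, `matrix` holds the same entries a full rebuild over the four records would produce, but only three new comparisons were made. An empty entry in such a matrix would mean two records agree on every condition attribute yet have different decisions, that is, the table has become inconsistent.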