Study Of Several Knowledge Reduction Algorithms Based On Complete And Incomplete Information Systems

Posted on: 2006-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
GTID: 2168360152466582
Subject: Computer applications
Abstract/Summary:
With the rapid advance and wide application of database technology, we are drowning in a volume of data that grows daily and far outpaces our ability to analyze it. Data mining is an effective tool for uncovering potentially useful knowledge hidden in such data. However, because these data sets are large and dynamic, it is impractical to extract knowledge from scratch every time new data is inserted; doing so wastes hardware and software resources. Incremental data mining algorithms are therefore urgently needed. In addition, data sets often contain erroneous or missing values for various reasons, so attribute reduction on incomplete information systems has become a new research topic.

Rough set theory was first proposed by Pawlak in 1982. After 20 years of research and development, it has produced fruitful results in both theory and application. Rough set theory does not depend on any prior information beyond the data set itself, and it is well suited to analyzing and handling imprecise, uncertain, and incomplete information. Intensive research on rough set theory will help extract valuable and easily understood knowledge from large amounts of data more effectively, and will help bring data mining into wider use in commercial systems.

In this thesis, we study rough set models in both complete and incomplete information systems. Our main results are as follows:

1. For complete information systems, after studying and analyzing several common attribute reduction algorithms, we propose an incremental algorithm that derives rules based on the concept of the distribution reduct. Incremental data mining only modifies the rule set when the database is updated, which reuses previous computation results and avoids extracting knowledge from the very beginning.
2. For incomplete information systems, after studying and analyzing extended rough set models, we improve a layer-based method and propose an algorithm for obtaining rules.
3. Combining the work above, we present a further incremental algorithm for attribute reduction in incomplete information systems.
4. After studying and analyzing data preprocessing techniques, we propose a divinable, automatic clustering algorithm that discretizes continuous data.
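To make the notion of attribute reduction in a complete information system concrete, the following is a minimal Python sketch, not the algorithm proposed in the thesis. It uses a generic positive-region check: a condition attribute is dropped if the set of objects that can still be classified consistently does not change. The toy decision table, its attribute names, and the function names `partition`, `positive_region`, and `greedy_reduct` are all assumptions invented for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into indiscernibility classes over `attrs`."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, cond, dec):
    """Return indices of rows whose condition class has one decision value."""
    pos = set()
    for block in partition(rows, cond):
        if len({rows[i][dec] for i in block}) == 1:
            pos.update(block)
    return pos


def greedy_reduct(rows, cond, dec):
    """Drop attributes one at a time while the positive region is unchanged.

    This yields one reduct-like subset that depends on attribute order;
    it is not guaranteed to be a minimum-size reduct.
    """
    full = positive_region(rows, cond, dec)
    kept = list(cond)
    for a in cond:
        trial = [b for b in kept if b != a]
        if trial and positive_region(rows, trial, dec) == full:
            kept = trial
    return kept


# Toy decision table (hypothetical data, for illustration only).
table = [
    {"headache": "yes", "temp": "high",   "cough": "yes", "flu": "yes"},
    {"headache": "yes", "temp": "normal", "cough": "yes", "flu": "no"},
    {"headache": "no",  "temp": "high",   "cough": "no",  "flu": "yes"},
    {"headache": "no",  "temp": "normal", "cough": "no",  "flu": "no"},
]

reduct = greedy_reduct(table, ["headache", "temp", "cough"], "flu")
print(reduct)  # prints ['temp']
```

In this toy table the decision `flu` is fully determined by `temp`, so the other two attributes are redundant. This sketch recomputes everything from the full table on every call; an incremental method of the kind the abstract describes would aim to avoid that by updating existing results when new rows arrive.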
Keywords/Search Tags:rough set, complete information system, Distributing Reduction, incomplete information system, limited tolerance relation, incremental, clustering