Font Size: a A A

Research On Incremental Mining Algorithm For Incomplete Data

Posted on:2011-11-12Degree:MasterType:Thesis
Country:ChinaCandidate:D D ZhangFull Text:PDF
GTID:2178330332991722Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Many databases are dynamic in real life, which makes the technique of incremental mining become an important issue in the field of data mining. Now a lot of results have been obtained on the study of incremental data mining for complete data. However, related work for incomplete data is just in its beginning phrase, and relevant achievements are very lack. So it is very significant to carry out the research of incremental mining algorithm for incomplete data.In this thesis, the problem of how to realize efficiently knowledge update through getting the best match between incremental data and existing knowledge is deeply studied. The main content includes following aspects:1. Incremental update of attribute core: An algorithm to update core attributes is proposed based on discernibility matrix. Firstly, a concept of approximation accuracy in incomplete data is defined as a measure for attribute significance. Secondly, a definition of core attribute for incomplete data is given according to the significance of condition attributes. Thirdly, an improved discernibility matrix is constructed for incomplete data based on an existing discernibility matrix. Finally, an incremental algorithm for attribute core acquisition is proposed based on the improved discernibility matrix and the definition of attribute core. Both theoretical analysis and an example show that the presented method can update attribute core of incomplete data efficiently.2. Normal form conversion of discernibility function: The process obtaining all reducts of a decision table based on discernibility matrix is also a process transforming a discernibility function from conjunction normal form to disjunction normal form, and the transformation efficiency plays an important role in the performance of attribute reduction algorithm. Through making the best of absorptivity of conjunctive disjunctive operation, based on the mechanism of artificial normal form conversion, an algorithm transforming a discernibility function from conjunction normal form to disjunction normal form is proposed with virtue of queue structure. This algorithm is easy to understand and to realize, and simulation experiments show that it is very efficient to accomplish normal form conversion.3. Incremental update of attribute reduct: two algorithms are proposed. One is based on generalized decision. Firstly, different cases of generalized decision caused by new objects are deeply analyzed. And then, update methods for different cases are given under keeping generalized decision unchanged. Finally, an algorithm for incrementally computing one reduct is proposed by combining different update strategies. The other, obtaining all attribute reducts of a decision table, is based on discernibility matrix. It is presented through analyzing different cases of discernibility matrix and designing update strategies for each case. Theoretical analysis and an example indicate that both algorithms can accomplish attribute reduction efficiently.4 Incremental rules acquisition: Firstly, a conjunctive term table is designed to register conjunctive terms produced by discernibility function and the update of rules is transformed to that of the conjunctive term table. Then, based on the relation with existing rules, new objects are divided into two categories: matched or collided respectively. Finally, conjunctive term table is updated based on the connectability of conjunctive disjunctive operation. According to different categories of new objects, rules are extracted from the updated conjunctive term table.5. Batch incremental acquisition of classification rule: Existing incremental algorithms for rule induction are proposed from the view of new objects. A limitation of these algorithms is that they have to access rule base once for each new object for updating rules. From the view of rule base, a batch incremental algorithm to induce rules is presented. Firstly an equivalence class table is built for all new objects. Then original rule base is efficiently matched with the equivalence class table. Finally the rule base is updated for new objects based on the different match types. The algorithm can deal with both complete and incomplete decision tables and accomplish the rule update through only accessing rule base twice. Theoretical analysis and comparison experiments on UCI datasets demonstrate that our algorithm outperforms traditional algorithms.In this thesis, incremental mining problem in incomplete data is studied systematically. The theory of incremental mining is extended to incomplete data. Several efficient algorithms are proposed and verified by theoretical analysis and experiments. It provides several efficient and practical solutions for incremental mining in incomplete data.
Keywords/Search Tags:incomplete data, rough sets, incremental technique, attribute core, attribute reduct, rule acquisition
PDF Full Text Request
Related items