Research On Algorithm Of Mining Association Rules Based On Matrix

Posted on: 2008-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2178360215966033
Subject: Computer software and theory
Abstract/Summary:
Data mining is a technique for analyzing and understanding large volumes of source data and revealing the knowledge hidden in them. It is regarded as an important step in the evolution of information processing. Association rule mining, one of the major research branches of data mining, describes potential correlations in large quantities of data and has significant application prospects.

Since Rakesh Agrawal et al. proposed the problem of mining association rules in 1993, researchers have put forward many algorithms, such as Apriori and FP-growth, along with a variety of improved algorithms based on these two. Most of these algorithms treat all items uniformly. In real-world databases, however, different items usually differ in importance. A natural idea is to assign each item a weight that reflects its importance, so this thesis studies algorithms for mining weighted association rules. In addition, when mining association rules in practice, users often need to adjust the minimum support and minimum confidence thresholds to find the rules that really interest them, while the data in the database is constantly being added, modified, or deleted; mining is therefore a dynamic, interactive process. For this reason, updating association rules is also worth studying.

The thesis first introduces the basic theories, approaches, and problems of data mining, followed by the concepts, categories, and general ideas of popular association rule algorithms. Several classic algorithms for extracting association rules and weighted association rules are discussed in depth, and their problems are analyzed. On this basis, the thesis proposes a new weighted association rule model and an effective algorithm, AMB, for mining weighted frequent itemsets. The new algorithm finds frequent itemsets using a matrix. It scans the transaction database only once to convert it into a 0-1 matrix, and then applies the logical AND operation to bit strings to judge whether a particular itemset is frequent, which avoids repeatedly scanning the original transaction database. Theoretical analysis and experimental results indicate that the new AMB algorithm is more effective and efficient. Based on an analysis of existing incremental update algorithms, an improved incremental update algorithm, MFUP, is proposed, and experiments are carried out to confirm its efficiency. The last part of the thesis presents conclusions and prospects for future research directions.
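The matrix-based counting step described in the abstract (one scan to build a 0-1 matrix, then a logical AND of bit strings to test an itemset) can be illustrated with a minimal sketch. This is not the thesis's AMB algorithm: it ignores item weights, uses a plain Apriori-style level-wise search, and all names (`build_bit_columns`, `support_count`, `frequent_itemsets`, `min_count`) are hypothetical, chosen only for illustration. Each item's matrix column is stored as a Python integer used as a bit string.

```python
from itertools import combinations


def build_bit_columns(transactions):
    """Scan the transactions once and return one bit string per item.

    Bit t of columns[item] is 1 when transaction t contains the item,
    so each integer is one column of the 0-1 transaction matrix.
    """
    columns = {}
    for t, transaction in enumerate(transactions):
        for item in set(transaction):
            columns[item] = columns.get(item, 0) | (1 << t)
    return columns


def support_count(columns, itemset):
    """AND the bit strings of all items and count the remaining 1 bits."""
    items = list(itemset)
    if not items:
        return 0
    bits = columns.get(items[0], 0)
    for item in items[1:]:
        bits &= columns.get(item, 0)
    return bin(bits).count("1")


def frequent_itemsets(transactions, min_count):
    """Level-wise search that never rescans the original transactions."""
    columns = build_bit_columns(transactions)  # the only pass over the data
    frequent = {}
    level = [frozenset([item]) for item in sorted(columns)
             if support_count(columns, [item]) >= min_count]
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support_count(columns, itemset)
        # Join frequent k-itemsets into (k+1)-item candidates (assumed
        # Apriori-style generation; the thesis may generate them differently).
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k + 1}
        level = [c for c in candidates
                 if support_count(columns, c) >= min_count]
        k += 1
    return frequent
```

Calling `frequent_itemsets(transactions, min_count)` with a list of item lists returns a dictionary that maps each frequent itemset to its support count. After the initial scan, every support test is a handful of integer AND operations on the stored columns.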
Keywords/Search Tags: Data Mining, Association Rule, Weighted Association Rule, Frequent Itemset, Incremental Update