Search Of Algorithms For Mining Maximum Frequent Item-sets

Posted on: 2008-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Li
Full Text: PDF
GTID: 2178360245991773
Subject: Computer application technology
Abstract/Summary:
Association rule mining is one of the most important areas of data mining. It is used to discover interesting relations between the items or attributes of a database. These relations are unknown and hidden; that is, they cannot be obtained through traditional logic operations or statistics.

Firstly, this paper introduces the basic concepts and classification of data mining and the general ideas behind association rule algorithms. Emphasis is placed on algorithms for mining frequent item-sets. Because the maximum frequent item-sets contain every frequent item-set as a subset, the problem of mining frequent item-sets can be converted to the problem of mining maximum frequent item-sets. Moreover, some data mining applications need only the maximum frequent item-sets rather than all frequent item-sets. Mining maximum frequent item-sets is therefore very important in data mining.

Secondly, this paper studies how to determine the new maximum frequent item-sets when new records are added to the database, i.e. the problem of incremental updating of maximum frequent item-sets, and presents an effective updating algorithm. The algorithm uses the FP-tree and the maximum frequent item-sets that have already been mined to discover the new maximum frequent item-sets. When processing new records, it no longer adds nodes to the existing FP-tree or increases the support counts of its nodes. Instead, it creates a new subtree under the root, adds nodes to that new subtree, or increases support counts within it. The algorithm handles only the frequent items whose support counts have increased, not those whose support counts do not change. The experimental results show that this algorithm is more efficient than DMFIA, an FP-tree-based algorithm for mining maximum frequent item-sets.

Thirdly, a new algorithm for mining maximum frequent item-sets based on an FP-array is presented. Many earlier algorithms for mining maximum frequent item-sets scan the database twice, build an FP-tree, and then mine that tree. The main idea of the new algorithm is to convert a transaction database into an FP-array by scanning the database only once. The FP-array retains all the information about the items in the database, and the algorithm then mines the array directly. The FP-array uses memory more economically because it stores only logical data, and mining on it does not require creating conditional arrays during the mining process. Because the mining process relies on logic operations, it has an advantage in efficiency. An experiment is carried out to verify the effectiveness of this algorithm.
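The abstract does not describe the layout of the FP-array, so the following is only an illustrative sketch of the two general ideas it relies on: storing the database as logical (bit) data built in a single scan, and counting support with logic operations. The function names, the one-integer-per-item bit column, and the brute-force enumeration are all assumptions made for illustration; they are not the thesis's algorithm.

```python
from itertools import combinations


def build_bit_columns(transactions):
    """Single scan: map each item to an int whose bit i is set
    when transaction i contains that item (hypothetical layout)."""
    columns = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            columns[item] = columns.get(item, 0) | (1 << i)
    return columns


def support(itemset, columns, n_transactions):
    """Support count = number of set bits after AND-ing the item columns."""
    mask = (1 << n_transactions) - 1
    for item in itemset:
        mask &= columns.get(item, 0)
    return bin(mask).count("1")


def frequent_itemsets(columns, n_transactions, min_support):
    """Brute-force level-wise enumeration (fine only for tiny examples)."""
    items = sorted(columns)
    found = []
    for size in range(1, len(items) + 1):
        level = [frozenset(c) for c in combinations(items, size)
                 if support(c, columns, n_transactions) >= min_support]
        if not level:
            # No frequent set of this size, so none of any larger size.
            break
        found.extend(level)
    return found


def maximal_itemsets(frequent):
    """Keep the frequent item-sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
                {"b", "c"}, {"a", "b", "c"}]
columns = build_bit_columns(transactions)
frequent = frequent_itemsets(columns, len(transactions), min_support=3)
maximal = maximal_itemsets(frequent)
```

In this toy database every frequent item-set is a subset of at least one maximal item-set, which is the property that lets the frequent item-set problem be reduced to the maximum frequent item-set problem.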
Keywords/Search Tags:Data mining, Association rules, Incremental updating algorithm, Frequent pattern array, Maximum frequent item-sets