
Research And Application Of A New Algorithm Based On Several Algorithms Of Mining Association Rules

Posted on: 2010-09-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Tian | Full Text: PDF
GTID: 2178360272996231 | Subject: Computer software and theory
Abstract/Summary:
With advances in science and technology, computer and network technology have developed rapidly, and the degree of informatization in every field of human production and daily life keeps rising. In particular, with the widespread use of digital devices, people's ability to produce and collect data with computer technology has grown greatly, so the term "information explosion" is an apt description of the situation.

We have entered the information age. In recent decades the quantity of data held in databases has grown sharply, with data sets now on the scale of gigabytes. Excessive data has become a major problem: we are rich in data but poor in knowledge. The valuable data generated by production and daily life should not merely record history. Huge and seemingly unrelated data may hide meaningful rules and knowledge that can support decision making. We do not want merely to own a sea of data; our main purpose is to extract useful knowledge from it and to improve its utilization.

Need is the most powerful driving force behind human development and technological progress, and data mining (DM) emerged against this background. It increasingly shows broad application prospects and strong vitality. Data mining draws on many methods and technologies, e.g. databases, rule-based systems, artificial intelligence, knowledge representation, machine learning and visualization.

Association rule mining is used to discover associations among the items in a database. As more data is collected and stored, many users have become interested in mining association rules from their databases. Interesting associations discovered in large volumes of commercial transaction records can help in making business decisions, such as commercial design and cross-shopping.

Association rule mining has become an important research direction in the field of data mining. The association rule pattern is a kind of descriptive pattern.
Algorithms for discovering association rules belong to the family of unsupervised learning methods. Agrawal et al. posed the problem of mining association rules between item sets in customer transaction databases, and many researchers have since studied association rules extensively. Mining association rules is usually decomposed into two steps: finding all frequent item sets, and then generating association rules from those frequent item sets. The second step builds on the first and requires relatively little work, so overall performance depends on the first step.

Association rules are evaluated mainly by support and confidence. Generally speaking, support measures the importance of a rule, and confidence measures its accuracy.

This thesis introduces the origin and development of data mining, and then its basic concepts, objects, tasks, procedure and future development directions. It studies in depth the theory and several classic algorithms, such as Apriori, LIG and FP-growth.

Building on this study of classic association rule mining algorithms, the thesis proposes a new algorithm, IFP-growth. The algorithm's performance is improved by using a new data structure, an improved method of storing data, and a compressed prefix tree.

The detailed research content is as follows. The way transactions are stored is improved and the transactions are compressed, which reduces the number of transaction traversals and the workload. The algorithm differs from FP-growth: IFP-growth uses a compressed prefix tree to reduce the amount of transaction data stored. A data structure IG (item group) is designed. Transactions are further divided on the basis of their item sets, so the memory needed to represent the transactions drops markedly and the algorithm performs better.

Finally, the improved algorithm is implemented. Experiments show that its efficiency is clearly improved.

Plans for future work: study how IFP-growth operates under parallel conditions.
Data mining requires a large amount of computation, and replacing serial computing with parallel computing is the main way to improve efficiency. Further work will study the performance of IFP-growth on a wider variety of data sets (varying not only in scale but also in data type), and its performance when mining other types of association rules.
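
The support and confidence measures mentioned in the abstract can be illustrated with a minimal Python sketch. It uses the common textbook definitions (support as the fraction of transactions containing an item set, confidence as the support of the whole rule divided by the support of its antecedent); the abstract itself does not give formulas, so these definitions, the function names and the toy transactions are assumptions for illustration, not the thesis's implementation and not part of IFP-growth.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[Transaction],
) -> float:
    """Confidence of the rule antecedent -> consequent.

    Computed as support(antecedent | consequent) / support(antecedent).
    Returns 0.0 when the antecedent never occurs.
    """
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / base


# Hypothetical toy data: four shopping baskets.
transactions: List[Transaction] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(frozenset({"bread", "milk"}), transactions)
rule_confidence = confidence(
    frozenset({"bread"}), frozenset({"milk"}), transactions
)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

In this toy data, bread and milk appear together in two of four baskets, and bread appears in three, so the rule bread -> milk has support 0.5 and confidence 2/3. A rule is typically kept only if both values exceed user-chosen minimum thresholds.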
Keywords/Search Tags: data mining, association rule, frequent item set, IFP-growth