Research And Application On Algorithm Of Association Rule In Data Mining

Posted on: 2007-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: W F Gao
GTID: 2178360182480429
Subject: Computer application technology
Abstract/Summary:
Data mining aims to reveal implicit but useful information in massive, incomplete, noisy, and fuzzy datasets. Its essential goal is to extract valuable patterns from large-scale databases. Association rule mining is an important branch of data mining that has produced many valuable results, but a good number of challenging problems remain open.
The task of mining association rules consists of two main steps. The first is finding the set of all frequent itemsets. The second is generating and testing all high-confidence rules among those itemsets. For both steps, computational complexity is the bottleneck, because the number of frequent itemsets can grow exponentially with the number of items.
This thesis describes an algorithm called Partition that differs fundamentally from the previous algorithms in that it scans the database at most twice to generate all significant association rules. The algorithm executes in two phases. In phase I, the Partition algorithm logically divides the database into a number of non-overlapping partitions. The partitions are considered one at a time, and all large itemsets for each partition are generated. At the end of phase I, these large itemsets are merged to form the set of all potential large itemsets. In phase II, the database is scanned a second time, the Apriori algorithm is applied to these candidate itemsets, their actual supports are computed, and the frequent itemsets are identified. The partition sizes are chosen so that each partition fits in main memory and is therefore read only once in each phase.
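The two phases described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `local_frequent_itemsets` and `partition_mine`, the in-memory list of transactions, and the fractional `min_support` parameter are all assumptions made for illustration, and phase II is simplified to a plain counting pass over the merged candidates.

```python
from itertools import combinations
from math import ceil


def local_frequent_itemsets(partition, min_support):
    """Apriori-style level-wise search inside one partition (hypothetical helper)."""
    threshold = min_support * len(partition)
    # Count single items first.
    counts = {}
    for transaction in partition:
        for item in transaction:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s for s, c in counts.items() if c >= threshold}
    frequent = set(current)
    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be locally frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Keep the candidates that meet the local support threshold.
        current = {
            c for c in candidates
            if sum(1 for t in partition if c <= t) >= threshold
        }
        frequent |= current
        k += 1
    return frequent


def partition_mine(transactions, min_support, num_partitions):
    """Two-pass Partition sketch: returns {itemset: global support count}."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}
    size = ceil(n / num_partitions)
    # Phase I (first scan): merge the locally large itemsets of each partition.
    candidates = set()
    for start in range(0, n, size):
        chunk = transactions[start:start + size]
        candidates |= local_frequent_itemsets(chunk, min_support)
    # Phase II (second scan): count actual global support of every candidate.
    result = {}
    for candidate in candidates:
        count = sum(1 for t in transactions if candidate <= t)
        if count >= min_support * n:
            result[candidate] = count
    return result


if __name__ == "__main__":
    # Made-up toy transactions, used only to exercise the sketch.
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}, {"d"}]
    for itemset, count in sorted(partition_mine(toy, 0.5, 2).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
```

The sketch relies on the property that makes two scans sufficient: an itemset that meets the global support threshold must meet the same fractional threshold in at least one partition, so the merged phase I output contains every globally frequent itemset and phase II only has to discard false positives.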
Both the theoretical analysis and the experimental comparison show that the algorithm proposed in this thesis performs better than the Apriori algorithm.
The thesis applies these research results to a medical service information system and constructs a relational model of seasonal epidemic diseases, in order to discover the relations between the season and the epidemic diseases people are likely to suffer from. It further discusses possible hidden relations among different diseases, which can serve as a reference for disease prevention and control.
Finally, the thesis summarizes the work as a whole and discusses prospects for the research and development of data mining and the Apriori algorithm.
Keywords/Search Tags:Data Mining, Association Rule, Apriori, Partition, Relational model