
The Research Of Mining Association Rule Algorithms

Posted on: 2006-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: T Han
Full Text: PDF
GTID: 2168360155470061
Subject: Computer application technology
Abstract/Summary:
Data mining is the process of discovering interesting knowledge from large volumes of data stored in databases, data warehouses, or other information repositories. It encompasses many techniques, such as association rule mining, prediction, classification, clustering, and evolutionary analysis. Of these, association rule mining is the most important and the most widely used. The concept of association rules was first proposed in 1993 by Dr. Rakesh Agrawal, then working at IBM, to describe relationships between items that frequently occur together in transaction databases. Although more than ten years of research on the subject have yielded many results, many problems remain that urgently need to be resolved. This thesis gives a detailed introduction to research in this area and explores association rule mining theory in depth, with particular attention to mining algorithms, and obtains some valuable results.

The thesis first studies several typical association rule mining algorithms: Apriori, AprioriTid, AprioriHybrid, AprioriRFM, partition algorithms, and sampling algorithms. To address the defects of these algorithms, it then introduces a new algorithm, AprioriTidHybrid, which extracts association rules faster. Building on Apriori and AprioriTid, AprioriTidHybrid uses Apriori in its initial passes and later switches to AprioriTid. Because C̄2 may be larger than the original database, it derives C̄2 from L2 instead of C2, which significantly reduces the size of C̄2. It also generates candidate itemsets with the more efficient DAgen instead of Apriorigen. The experimental results show that the new algorithm outperforms Apriori and AprioriTid.

The sampling-based association rule algorithm, which is grounded on Apriori and AprioriTid, makes the following contributions: (1) a new, more efficient algorithm, FASTA, is proposed; (2) samples are chosen with the typical FAST algorithm, so the chosen samples are more representative and accurate; (3) AprioriTidHybrid is employed to mine the chosen samples. The experimental results show that FASTA outperforms the other algorithms.
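
The hybrid strategy summarized above can be illustrated with a minimal Python sketch. It is one reading of the abstract, not the thesis's actual AprioriTidHybrid. It assumes that the first two passes count candidates against the database as in Apriori, that the per-transaction candidate set (C-bar-2 in AprioriTid notation) is then built from the frequent pairs L2 only, and that later passes work on that structure as in AprioriTid. The abstract does not describe DAgen, so the standard apriori-gen join and prune step is used in its place. The names `apriori_gen` and `hybrid_frequent_itemsets` are hypothetical.

```python
from itertools import combinations


def apriori_gen(prev_frequent, k):
    """Standard apriori-gen (stand-in for DAgen, which the abstract does not detail).

    Joins (k-1)-itemsets that share their first k-2 items, then prunes any
    candidate that has an infrequent (k-1)-subset. Itemsets are sorted tuples.
    """
    prev = sorted(prev_frequent)
    prev_set = set(prev)
    candidates = set()
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:k - 2] != b[:k - 2]:
                break  # sorted order: no later tuple shares this prefix
            cand = a + (b[-1],)
            if all(sub in prev_set for sub in combinations(cand, k - 1)):
                candidates.add(cand)
    return candidates


def hybrid_frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for all itemsets with count >= min_count."""
    db = [tuple(sorted(set(t))) for t in transactions]

    # Pass 1 (Apriori-style): count single items against the database.
    counts = {}
    for t in db:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    level = {c: n for c, n in counts.items() if n >= min_count}
    frequent = dict(level)

    # Pass 2 (Apriori-style): count candidate pairs against the database.
    counts = dict.fromkeys(apriori_gen(level, 2), 0)
    for t in db:
        for pair in combinations(t, 2):
            if pair in counts:
                counts[pair] += 1
    level = {c: n for c, n in counts.items() if n >= min_count}
    frequent.update(level)

    # Assumption: build C-bar-2 from L2 only, so each transaction keeps just
    # its frequent pairs and transactions without any are dropped.
    cbar = []
    for t in db:
        present = {p for p in combinations(t, 2) if p in level}
        if present:
            cbar.append(present)

    # Passes 3+ (AprioriTid-style): the database is no longer scanned.
    k = 3
    while level:
        counts = dict.fromkeys(apriori_gen(level, k), 0)
        next_cbar = []
        for present in cbar:
            # A candidate occurs in the transaction iff both itemsets that
            # were joined to generate it occurred there in the previous pass.
            found = {c for c in counts
                     if c[:-1] in present and c[:-2] + c[-1:] in present}
            for c in found:
                counts[c] += 1
            if found:
                next_cbar.append(found)
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        cbar = next_cbar
        k += 1
    return frequent
```

Calling `hybrid_frequent_itemsets([["A", "C", "D"], ["B", "C", "E"], ["A", "B", "C", "E"], ["B", "E"]], 2)` returns a dictionary keyed by sorted item tuples, with the support count as the value. Restricting the per-transaction sets to frequent pairs does not lose any result in this sketch, because every candidate 3-itemset is generated by joining two members of L2.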
Keywords/Search Tags: data mining, association rule, AprioriTidHybrid, AprioriRFM, FAST, FASTA