
Research On Correlative Algorithms Of Association Rule Mining

Posted on: 2010-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Mao
GTID: 2178360278459425
Subject: Computer application technology
Abstract/Summary:
Data mining has been a hot topic in the database and artificial intelligence fields in recent years and has attracted extensive attention from both science and industry. Against this background, association rule techniques have developed vigorously. Association rule mining is an important branch of data mining whose task is to discover association rules that satisfy both a minimum support threshold and a minimum confidence threshold. It has been widely used in marketing and transaction analysis. Association rule mining algorithms are the core of the area, and many well-known algorithms have been proposed so far.

This thesis systematically analyzes and studies association rule mining and puts forward two improved algorithms on the basis of existing research. The main work of the thesis is as follows:

(1) The basic theory of association rule mining is discussed systematically, including basic concepts, classification, the mining process, and measures of rule value. On this basis, extensions of the association rule problem are analyzed and association rule mining research is discussed.

(2) The classic association rule algorithm, Apriori, is analyzed thoroughly, and its key mining steps and shortcomings are pointed out. Related work on the binary algorithm and the matrix algorithm is then analyzed and summarized, and a new association rule mining algorithm based on a frequent itemsets matrix is proposed. The new algorithm does not need to generate candidate frequent itemsets and scans the database only once; it stores and reduces the transaction data with the frequent support matrix, so it takes much less storage space. Theoretical analysis and experimental tests show that the new algorithm is superior in performance.

(3) By studying and analyzing the frequent pattern growth (FP-growth) algorithm and taking into account the characteristics of user access patterns, another new algorithm, named AP-mining, is put forward for mining user access patterns. AP-mining maps the Web users' session set to an access pattern tree, then mines frequent access patterns and generates a pattern array from the access pattern tree, and finally finds all frequent access patterns in the pattern array. The mined pattern knowledge can help to design and maintain Web sites better.
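
The support and confidence thresholds mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's frequent itemsets matrix algorithm, whose details are not given here. It only assumes a boolean transaction-by-item representation, stored as one bitmap per item and built in a single pass over the data, and restricts itself to rules between two single items. The function names (`build_item_bitmaps`, `support`, `pair_rules`) and the toy transactions are hypothetical.

```python
from itertools import combinations


def build_item_bitmaps(transactions):
    """Single pass over the data: map each item to an int whose bit i is set
    when transaction i contains that item (one column of a boolean matrix)."""
    bitmaps = {}
    for i, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps


def support(itemset, bitmaps, n_transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= bitmaps.get(item, 0)  # keep only transactions with this item
    return bin(mask).count("1") / n_transactions


def pair_rules(transactions, min_support, min_confidence):
    """Return rules a -> b between single items as tuples
    (a, b, support, confidence) that meet both thresholds."""
    n = len(transactions)
    bitmaps = build_item_bitmaps(transactions)
    rules = []
    for x, y in combinations(sorted(bitmaps), 2):
        pair_support = support({x, y}, bitmaps, n)
        if pair_support < min_support:
            continue  # the pair is not frequent, so no rule is generated
        for a, b in ((x, y), (y, x)):
            # a appears in at least one transaction, so the divisor is > 0
            conf = pair_support / support({a}, bitmaps, n)
            if conf >= min_confidence:
                rules.append((a, b, pair_support, conf))
    return rules


# Toy data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(pair_rules(transactions, min_support=0.5, min_confidence=0.8))
```

With these toy transactions, the pair {bread, butter} has support 0.5, and the rule butter -> bread has confidence 1.0 because every transaction containing butter also contains bread. Bitmaps are used here only because they make support counting a bitwise AND; other layouts would serve equally well for the illustration.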
Keywords/Search Tags:Data mining, Association rules, Apriori algorithm, Frequent item sets matrix, FP-growth algorithm, Frequent user access patterns