
A Frequent Itemsets Mining Algorithm Based On Apriori And FP-TREE

Posted on: 2019-08-15  Degree: Master  Type: Thesis
Country: China  Candidate: L M Huang  Full Text: PDF
GTID: 2428330548491794  Subject: Computer Science and Technology
Abstract/Summary:
With the advancement of information technology, the amount of accumulated data is growing rapidly, and huge volumes of data are now stored in databases, data warehouses, and other repositories. As a result, data mining has attracted more and more attention. Analyzing a database to extract useful or previously unknown patterns and rules is called association rule mining. It has become one of the most important descriptive tasks in data mining and can be defined as finding meaningful patterns in large data sets. Mining frequent itemsets is the basis for mining association rules, so the research problem of this thesis is how to mine frequent itemsets quickly. The thesis first reviews many classic frequent itemset mining algorithms proposed over the past decades, including techniques based on horizontal layouts, vertical layouts, and matrix-based layouts. This review lays the theoretical groundwork for proposing a frequent itemset mining algorithm with better performance and functionality. To mine frequent patterns, however, most current techniques suffer from repeated database scans and candidate set generation (the Apriori algorithm), memory consumption problems (the FP-tree algorithm), and other drawbacks. In practice, for example in the retail industry, many transaction databases contain multiple identical transactions. To apply this observation to the shortcomings of the Apriori and FP-tree algorithms, this thesis proposes a new technique that combines an improved Apriori algorithm with the FP-tree technique to achieve better performance than the classical Apriori algorithm. The new method first uses the improved Apriori algorithm to find the maximal frequent itemsets. It then prunes the database, keeping only those transactions that contain frequent 1-items not covered by the maximal frequent itemsets, and constructs the FP-tree from the pruned database. It has been shown that, on a shopping basket dataset, the new method outperforms both the Apriori algorithm and the FP-tree algorithm in terms of time and memory consumption.
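The two ideas the abstract relies on, merging identical transactions and level-wise Apriori search for frequent itemsets, can be illustrated with a minimal Python sketch. This is not the thesis's improved Apriori or its FP-tree construction, whose details the abstract does not give. The function names `compress`, `apriori`, and `maximal`, the absolute support threshold `min_support`, and the sample baskets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def compress(transactions):
    """Merge identical transactions into {itemset: multiplicity}."""
    return Counter(frozenset(t) for t in transactions)


def apriori(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support.

    Hypothetical helper: a plain level-wise Apriori over the compressed
    database, so each distinct transaction is scanned once per level.
    """
    weighted = compress(transactions)

    # Level 1: count single items, weighted by transaction multiplicity.
    counts = Counter()
    for items, n in weighted.items():
        for item in items:
            counts[frozenset([item])] += n
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step: one pass over the distinct transactions.
        counts = Counter()
        for items, n in weighted.items():
            for cand in candidates:
                if cand <= items:
                    counts[cand] += n

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def maximal(frequent):
    """Frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < t for t in frequent)]


# Made-up shopping baskets; the first two are identical and get merged.
baskets = [
    ["bread", "milk"],
    ["bread", "milk"],
    ["bread", "butter"],
    ["milk", "butter"],
    ["bread", "milk", "butter"],
]
result = apriori(baskets, min_support=3)
```

With these sample baskets, `compress` stores four distinct transactions instead of five, and `maximal(result)` yields the itemsets that a hybrid method of the kind described above would compute before pruning the database.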
Keywords/Search Tags: Apriori, FP-tree, association rule mining, frequent itemsets, data mining