Research And Improvement Of Apriori Algorithm

Posted on: 2020-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: S C Hu
Full Text: PDF
GTID: 2438330590962236
Subject: Software engineering
Abstract/Summary:
This thesis systematically studies the Apriori algorithm and the FP-growth algorithm. Apriori requires multiple scans of the database, and the frequent I/O operations incurred while mining frequent itemsets make it inefficient. To address this, a Node-Apriori (Node-based Apriori) algorithm is proposed on the basis of Apriori. Node-Apriori improves execution efficiency in two ways. First, it encodes itemsets and transaction records in binary, which reduces the memory they occupy and improves execution efficiency by replacing set operations with equivalent operations on binary numbers. Second, it organizes the candidate set by means of nodes, which reduces the number of transaction records that must be traversed when counting the support of candidate itemsets. However, Node-Apriori still has disadvantages. Because it searches for frequent itemsets breadth-first, mining frequent k-itemsets produces many nodes at the same level, and the transaction records held by these same-level nodes are likely to be duplicated, so the algorithm places a considerable demand on memory.

To further improve execution efficiency, a Tree-Apriori (Tree-based Apriori) algorithm is proposed. It uses a trie to organize frequent transaction records, with each node representing a subtree; frequent itemsets are mined by merging and updating the nodes of the tree. Experimental results show that Tree-Apriori executes much more efficiently than both Apriori and Node-Apriori. The thesis also identifies a connection between Apriori and FP-growth. Tree-Apriori is an implementation of Node-Apriori, yet it is far superior to Node-Apriori in both time and space. Moreover, the overall implementation framework of Tree-Apriori is similar to that of FP-growth, which means that Apriori and FP-growth are fundamentally related.

Because food safety problems occur frequently, Xi'an food sampling data were chosen for exploring association rules. The aim is to identify the core of food safety issues and resolve them before problems occur, thereby improving people's confidence when buying food in daily life. After a series of data mining steps, many association rules were discovered to guide subsequent spot checks.
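The binary encoding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's Node-Apriori implementation, whose node structure is not described here; it only assumes that each item is mapped to one bit, so that an itemset or a transaction becomes an integer and the subset test becomes a single bitwise AND. All function names and the sample data are hypothetical.

```python
from itertools import combinations


def build_index(transactions):
    """Assign each distinct item a bit position (sorted for determinism)."""
    items = sorted({item for t in transactions for item in t})
    return {item: pos for pos, item in enumerate(items)}


def encode(itemset, index):
    """Encode a set of items as an integer bitmask."""
    mask = 0
    for item in itemset:
        mask |= 1 << index[item]
    return mask


def support(candidate, encoded_transactions):
    """Count transactions containing every item of the candidate mask."""
    return sum(1 for t in encoded_transactions if (t & candidate) == candidate)


def frequent_masks(encoded_transactions, n_items, min_support):
    """Level-wise (Apriori-style) search over bitmask-encoded itemsets.

    Candidates of size k are unions of two frequent (k-1)-itemsets; each
    candidate is kept only if its support reaches min_support.
    """
    result = {}
    level = {}
    for i in range(n_items):
        count = support(1 << i, encoded_transactions)
        if count >= min_support:
            level[1 << i] = count
    k = 1
    while level:
        result.update(level)
        k += 1
        candidates = {
            a | b
            for a, b in combinations(sorted(level), 2)
            if bin(a | b).count("1") == k
        }
        level = {}
        for c in candidates:
            count = support(c, encoded_transactions)
            if count >= min_support:
                level[c] = count
    return result


# Hypothetical sample data, for illustration only.
transactions = [
    {"milk", "bread"},
    {"milk", "eggs"},
    {"milk", "bread", "eggs"},
    {"bread"},
]
index = build_index(transactions)              # bread -> bit 0, eggs -> bit 1, milk -> bit 2
encoded = [encode(t, index) for t in transactions]
frequent = frequent_masks(encoded, len(index), min_support=2)
```

Here `encoded` holds one integer per transaction, and checking whether a candidate itemset occurs in a transaction costs one AND and one comparison instead of a set-containment test, which is the saving the abstract attributes to binary encoding.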
Keywords/Search Tags: Binary Encoding, FP-Tree, Apriori, FP-growth