
Application And Optimization Of Association Analysis Algorithm Based On Hash Table

Posted on: 2020-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: J B Li
Full Text: PDF
GTID: 2428330590957745
Subject: Operational Research and Cybernetics
Abstract/Summary:
Association analysis is a practical data mining technique for discovering interesting rules or relationships between itemsets. With the rapid development of modern society, however, the volume of accumulated data has become enormous, so in addition to placing higher demands on hardware such as computers, the corresponding algorithms must also be improved. The drawback of the classic Apriori algorithm is that the number of candidate itemsets and rules grows exponentially as factors such as the number of transactions and the maximum transaction width increase, so its computational and time complexity also rise dramatically. Moreover, computing the support count of each candidate k-itemset requires scanning the entire database, so its efficiency is low. The hash-table-based improvement of the Apriori algorithm first scans the entire database twice: the first scan builds a transaction weight table to reduce the number of transactions, and the second builds a hash table of the itemsets. After that, a search only needs to retrieve the corresponding position in the hash table and combine it with the transaction weights to obtain the support count of a candidate itemset, which narrows the search range and greatly improves efficiency. Experimental comparison and analysis show that when the support threshold is small, the hash-table-based Apriori algorithm has a clear advantage. As the support threshold increases, the running times of both algorithms decrease and converge.
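The abstract does not spell out the data structures, so the following Python sketch is only one plausible reading of the two-scan idea, not the thesis's actual implementation. It assumes that the transaction weight table collapses identical transactions into one entry whose weight is the duplicate count, and that the hash table maps each item to the set of distinct transactions containing it. All function names (`build_weight_table`, `build_item_index`, `support`, `naive_support`, `frequent_itemsets`) are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_weight_table(transactions):
    """Scan 1 (assumed): collapse identical transactions; weight = duplicate count."""
    return Counter(frozenset(t) for t in transactions)


def build_item_index(weights):
    """Scan 2 (assumed): hash table from item to ids of distinct transactions."""
    index = defaultdict(set)
    for tid, items in enumerate(weights):
        for item in items:
            index[item].add(tid)
    # Weights listed in the same order as the transaction ids above.
    return index, list(weights.values())


def support(candidate, index, tid_weights):
    """Support count from hash lookups only; the raw database is not rescanned."""
    buckets = [index.get(item, set()) for item in candidate]
    if not buckets:
        return 0
    common = set.intersection(*buckets)
    return sum(tid_weights[tid] for tid in common)


def naive_support(candidate, transactions):
    """Classic Apriori counting: one full database scan per candidate."""
    wanted = set(candidate)
    return sum(1 for t in transactions if wanted <= set(t))


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search using the weighted hash lookups for counting."""
    weights = build_weight_table(transactions)
    index, tid_weights = build_item_index(weights)
    level = [frozenset([item]) for item in sorted(index)
             if support([item], index, tid_weights) >= min_support]
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset, index, tid_weights)
        k += 1
        previous = set(level)
        # Join step: merge frequent (k-1)-itemsets that differ in one item.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in previous for s in combinations(c, k - 1))
                 and support(c, index, tid_weights) >= min_support]
    return result
```

In this sketch, `support` touches only the hash buckets of the candidate's items and the weights of the matching distinct transactions, whereas `naive_support` rereads every transaction for every candidate. The two return the same counts; the difference lies in how much data each candidate lookup has to visit.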
Keywords/Search Tags: Apriori algorithm, minimum support, frequent itemset, hash table