
Research On Mining Algorithm Of Frequent Itemsets And Erasable Itemsets Based On Weight

Posted on: 2021-02-17 | Degree: Master | Type: Thesis
Country: China | Candidate: M M Xu | Full Text: PDF
GTID: 2428330614458332 | Subject: Electronic and communication engineering
Abstract/Summary:
The explosive growth of data has led to the emergence of data mining technology. As an important branch of data mining, association rule mining can find hidden rules in big data and is widely used in many fields. Originally, association rule mining relied on support to find the frequent itemsets in a database. Because items in real databases differ in importance, frequent itemsets were extended to weighted frequent itemsets in order to mine the itemsets that users are more interested in. At the same time, real databases grow continuously, each product in the database has a different profit value, and items differ in importance. As a result, earlier frequent itemset mining methods can no longer meet practical needs, and mining weighted erasable itemsets on incremental datasets has attracted considerable attention from researchers. This thesis studies and improves mining algorithms for weighted frequent itemsets and weighted erasable itemsets, adopting efficient data structures and pruning methods to solve the problems encountered during mining. The specific contents are as follows.

Firstly, to address the complexity of tree construction and the low efficiency of existing algorithms for mining weighted frequent itemsets, an improved weighted frequent itemset mining algorithm is proposed. The algorithm constructs a weighted building tree (WBtree) that holds highly compressed information and stores the node information in a weighted building list (WB-list). The search space of itemsets is traversed as a set enumeration tree, and the subsume index is used to reduce the number of join operations between itemsets. In addition, the superset equivalence property speeds up the generation of weighted frequent itemsets and improves mining efficiency. The experimental results show that the improved algorithm has better time and space efficiency on dense databases and on sparse databases with higher weighted support.

Secondly, given that real data accumulates over time and items differ in importance, an improved algorithm for mining weighted erasable itemsets on incremental datasets is proposed. The algorithm uses a list structure to store the information of the itemsets in the database efficiently. In the dynamic incremental database, a weight condition is used to prune the itemsets that do not meet the threshold, which reduces memory consumption during itemset mining. Besides, to process incremental data efficiently, the algorithm combines the ideas of the subsume index and the difference set to simplify the calculation of gain. The experimental results show that the algorithm performs well on both dense and sparse databases in terms of running time and memory consumption, and that it scales well on synthetic databases.

Finally, this thesis discusses the open problems in mining weighted frequent itemsets and weighted erasable itemsets and points out possible directions for future research.
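
The abstract does not define weighted support or gain, so the following Python sketch is only an illustration under stated assumptions, not the thesis's algorithm. It assumes one common definition of weighted support (the average item weight multiplied by the ordinary support) and the usual unweighted definition of gain for erasable itemsets (the total profit of all products that contain at least one item of the itemset). It also shows how a difference set of product ids lets the gain of a combined itemset be derived from an already known gain. All data, names, and thresholds are hypothetical, and the WBtree, WB-list, and subsume index structures are not modelled here.

```python
# Hypothetical toy data: transactions and per-item weights.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "d"},
    {"a", "b", "c", "d"},
]
weights = {"a": 0.6, "b": 0.9, "c": 0.3, "d": 0.5}

# Hypothetical product database: product id -> (component items, profit).
products = {
    1: ({"a", "b"}, 100),
    2: ({"b", "c"}, 200),
    3: ({"c", "d"}, 50),
    4: ({"a", "d"}, 150),
}


def weighted_support(itemset, transactions, weights):
    """Assumed definition: average item weight times relative support."""
    support = sum(1 for t in transactions if itemset <= t) / len(transactions)
    avg_weight = sum(weights[i] for i in itemset) / len(itemset)
    return avg_weight * support


def pidset(itemset, products):
    """Ids of products that contain at least one item of the itemset."""
    return {pid for pid, (items, _) in products.items() if items & itemset}


def gain(itemset, products):
    """Total profit of the products that would be lost if the itemset is erased."""
    return sum(products[pid][1] for pid in pidset(itemset, products))


def gain_with_diffset(gain_x, pids_x, pids_y, products):
    """Gain of X union Y from the gain of X plus the profit of the
    products that appear only in Y's pidset (the difference set)."""
    return gain_x + sum(products[pid][1] for pid in pids_y - pids_x)


def is_erasable(itemset, products, threshold):
    """An itemset is erasable if its gain stays within threshold * total profit."""
    total = sum(profit for _, profit in products.values())
    return gain(itemset, products) <= threshold * total


if __name__ == "__main__":
    print(weighted_support({"a", "c"}, transactions, weights))
    g_a = gain({"a"}, products)
    g_ab = gain_with_diffset(
        g_a, pidset({"a"}, products), pidset({"b"}, products), products
    )
    print(g_a, g_ab, gain({"a", "b"}, products))
    print(is_erasable({"a"}, products, 0.6), is_erasable({"a", "b"}, products, 0.6))
```

The difference-set form only touches the products that are new to the combined itemset, which is why it can shorten the gain calculation when itemsets are extended one item at a time.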
Keywords/Search Tags: data mining, association rules, weighted frequent itemsets, weighted erasable itemsets, subsume index