
Research On Frequent Itemsets Mining Algorithm Based On Data Stream

Posted on: 2021-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: X H Geng
Full Text: PDF
GTID: 2428330614458178
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of 5G, artificial intelligence, cloud computing, and other information technologies, existing data stream mining algorithms can no longer meet current needs, and data mining over data streams is attracting increasing attention from researchers. However, mining a data stream is subject to many limitations: memory is limited and the requirements on the mining algorithm are higher, which makes the task more challenging. Association rule mining is an important part of data mining; it uncovers potential relationships between different transactions and different attributes. This thesis mines frequent itemsets and maximal frequent itemsets, as used in association rules, from data streams. During mining, an efficient compression structure is used to compress the data, a superset detection strategy is used to reduce the volume of data to be mined, and an efficient method is used to calculate support. Frequent itemset mining over data streams is studied and analyzed in depth from multiple angles. The main contents are as follows.

First, this thesis studies and improves FIUT-Stream, a classical algorithm for mining frequent itemsets in data streams, and proposes an efficient frequent itemset mining algorithm for data streams. The improved algorithm processes the stream with the common sliding window model and compresses the data into an efficient bit table. Support is calculated directly on the bit table: applying the AND operation to the bit-table entries of all items in an itemset gives the support of that itemset, so support can be computed quickly and the frequent itemsets can be mined. At the same time, a superset detection strategy reduces the amount of data that has to be mined. Experimental results show that the improved algorithm achieves higher mining efficiency while ensuring that the frequent itemsets it mines are accurate and valid.

Next, building on the improved algorithm above, this thesis proposes an algorithm for mining maximal frequent itemsets in data streams that has good time and space efficiency. The algorithm again processes the stream with a sliding window and compresses the data with an efficient bit table. When new data fills the sliding window, support is updated with simple addition and subtraction. Compared with mining all frequent itemsets, the algorithm skips several mining levels. In addition, mining starts from the longest itemsets and exploits the relevant properties of maximal frequent itemsets, which reduces the amount of mining required. Experimental results on different data sets and under multiple parameter settings show that the algorithm performs well when mining maximal frequent itemsets.

Finally, the thesis outlines research prospects in precision mining, concept drift, and the use of emerging technologies, and proposes some possible directions for future research.
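The bit-table idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the names `BitTableWindow`, `frequent_itemsets`, and `maximal_itemsets` are hypothetical, eviction by shifting every bit vector is an assumption, and the mining step is a plain level-wise enumeration rather than FIUT-Stream, the superset detection strategy, or the add-and-subtract support update. What it does show is the core technique: one bit vector per item over the current sliding window, with the support of an itemset obtained by ANDing the vectors of its items and counting the remaining 1 bits.

```python
from collections import deque
from itertools import combinations


class BitTableWindow:
    """Sliding window of transactions stored as one bit vector per item (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.window = deque()  # transactions in the window, oldest first
        self.bits = {}         # item -> int; bit i is set if window slot i contains the item

    def add(self, transaction):
        if len(self.window) == self.size:
            # Assumption: evict the oldest transaction (slot 0) by shifting each vector right.
            self.window.popleft()
            self.bits = {item: v >> 1 for item, v in self.bits.items() if v >> 1}
        slot = len(self.window)
        self.window.append(frozenset(transaction))
        for item in transaction:
            self.bits[item] = self.bits.get(item, 0) | (1 << slot)

    def support(self, itemset):
        """AND the bit vectors of all items, then count the surviving 1 bits."""
        acc = (1 << len(self.window)) - 1
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")


def frequent_itemsets(table, min_support):
    """Brute-force level-wise enumeration; stops at the first level with no frequent itemset."""
    items = sorted(i for i in table.bits if table.support([i]) >= min_support)
    result = {}
    for k in range(1, len(items) + 1):
        level = {c: table.support(c) for c in combinations(items, k)}
        level = {c: s for c, s in level.items() if s >= min_support}
        if not level:
            break
        result.update(level)
    return result


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    sets = [frozenset(c) for c in frequent]
    return [s for s in sets if not any(s < t for t in sets)]


# Usage: a window of 4 transactions; the fifth transaction pushes out the first.
table = BitTableWindow(size=4)
for t in [("a", "b", "c"), ("a", "b"), ("a", "c"), ("b", "c"), ("a", "b", "c")]:
    table.add(t)

print(table.support(("a", "b")))  # 2
freq = frequent_itemsets(table, min_support=2)
print(sorted(sorted(s) for s in maximal_itemsets(freq)))
```

Python integers serve as arbitrary-length bit vectors here, so a single `&` covers the whole window. A production implementation would more likely use fixed-width words or a circular slot index so that eviction does not touch every item.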
Keywords/Search Tags: big data, data stream, data mining, frequent itemsets