Research On Optimization Of Data Stream Frequent Itemsets Mining Algorithm Based On Sliding Window

Posted on: 2019-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
Full Text: PDF
GTID: 2438330545990748
Subject: Computer technology
Abstract/Summary:
With the development of information technology, industries such as medicine and e-commerce are growing rapidly, and the amount of data accumulated in every field keeps increasing. However, techniques for analyzing this data effectively are lacking. Data mining, which extracts useful patterns from data, emerged in response to the demand for obtaining knowledge from it. A data stream is a high-speed, continuous data set that differs from the traditional data held in a static database: it is continuous, real-time, and unbounded, and it arrives quickly. Because of these characteristics, traditional mining algorithms and techniques are hard to apply to data streams, so many scholars have studied data stream mining algorithms, and frequent itemset mining over data streams has become one of the main problems in data mining.

This paper analyzes the advantages and limitations of existing frequent itemset mining algorithms, focusing on algorithms for mining frequent itemsets and frequent closed itemsets, and proposes corresponding optimized algorithms. The main research contents are as follows.

First, the background and current state of data mining are introduced. The tasks of data mining are then outlined, the data stream and its characteristics are introduced, and several typical algorithms for mining frequent itemsets from data streams are analyzed and discussed.

Second, an improved sliding-window-based frequent itemset mining algorithm (SWFI) is proposed. The data stream is divided into groups and stored in an SWFI-tree. Once the window has filled for the first time, the window slides: the transactions that entered the window earliest are deleted, new transactions are read in, and frequent itemsets are generated by mining the tree. The experimental results show that the SWFI algorithm has good stability and timeliness and is suitable for mining frequent patterns over data streams.

Finally, to meet the needs of practical problems, a sliding-window-based algorithm for mining closed frequent patterns from data streams, SW_MFCI, is presented. Built on the sliding window structure, the algorithm merges twin itemsets to reduce the number of generators, stores frequent closed itemsets in a hash table, and applies superset and subset pruning. Experimental tests and a comparative performance analysis show that the SW_MFCI algorithm is feasible.
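The sliding-window idea described above can be illustrated with a small Python sketch. It is not the SWFI or SW_MFCI algorithm: it uses no SWFI-tree, no twin-itemset merging, and no generators, and instead counts every itemset of each transaction directly in a dictionary, which is exponential in transaction length and only practical for short transactions. The class name `SlidingWindowMiner`, its methods, and the parameters `window_size` and `min_support` (an absolute count) are assumptions made for this example. It shows only the window update (evict the oldest transaction, read the new one) and the definition of a closed itemset (no proper superset has the same support).

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Illustrative sketch only; names and structure are hypothetical."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size  # maximum number of transactions kept
        self.min_support = min_support  # absolute support threshold
        self.window = deque()           # transactions currently in the window
        self.counts = Counter()         # itemset -> support within the window

    @staticmethod
    def _itemsets(transaction):
        # Enumerate every non-empty subset of the transaction's items.
        items = sorted(set(transaction))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                yield frozenset(combo)

    def add(self, transaction):
        # Once the window is full, evict the oldest transaction first.
        if len(self.window) == self.window_size:
            oldest = self.window.popleft()
            for itemset in self._itemsets(oldest):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]
        # Then read the new transaction into the window.
        self.window.append(transaction)
        for itemset in self._itemsets(transaction):
            self.counts[itemset] += 1

    def frequent(self):
        # Itemsets whose support in the current window meets the threshold.
        return {s: c for s, c in self.counts.items() if c >= self.min_support}

    def closed_frequent(self):
        # A frequent itemset is closed if no proper superset has equal support.
        # Any such superset would itself be frequent, so checking `freq` suffices.
        freq = self.frequent()
        return {
            s: c
            for s, c in freq.items()
            if not any(s < t and c == ct for t, ct in freq.items())
        }
```

For example, with `window_size=3` and `min_support=2`, feeding the transactions `ab`, `abc`, `ac` gives the closed frequent itemsets `{a}`, `{a, b}`, and `{a, c}`; reading a fourth transaction `bc` evicts `ab` and changes the result to `{c}`, `{a, c}`, and `{b, c}`.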
Keywords/Search Tags:data mining, data stream, sliding window, frequent itemsets, frequent closed itemsets