
Research On Frequent Itemsets Mining Algorithm In Data Stream

Posted on: 2018-10-26 | Degree: Master | Type: Thesis
Country: China | Candidate: H Q Li | Full Text: PDF
GTID: 2348330569486409 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of real-time big data analysis applications, mining frequent itemsets in data streams has gradually become an active research topic. A data stream is high-speed, unbounded, and unpredictable, and may also exhibit concept drift. Data stream mining therefore has to pay closer attention to computing resources and memory space than mining over static data does. This thesis makes the following contributions.

First, to address concept drift in data streams, this thesis proposes VSW-SCPS, an algorithm for mining frequent itemsets in stream data based on a variable sliding window. The algorithm maintains a tree structure, the SCPS-tree, in memory to hold the data in the sliding window. As data flows in, the SCPS-tree is adjusted dynamically, and the window size changes according to the detected concept drift. VSW-SCPS is more efficient than the VSW algorithm. However, the variable sliding window keeps expanding, which affects mining efficiency to a certain extent.

Second, to solve this problem in VSW-SCPS, this thesis devises the LVSW-OSCPS algorithm, which mines maximal frequent itemsets in stream data based on a limited variable sliding window. The algorithm improves the variable sliding window model VSW and proposes a limited variable sliding window model, LVSW. It also draws on the properties of the ordered FP-tree to improve the SCPS-tree structure, yielding the proposed OSCPS-tree. As a result, no superset checking is needed while mining maximal frequent itemsets, which simplifies the construction of conditional subtrees and accelerates mining. The experimental results verify the effectiveness of LVSW-OSCPS.

This work shows that research on frequent-itemset-based stream data mining algorithms can effectively improve the efficiency of frequent itemset mining, which has both theoretical and practical significance.
Keywords/Search Tags: data stream, frequent itemset, variable sliding window, concept drift, maximal frequent itemset
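
To make the ideas in the abstract concrete, the following is a minimal Python sketch of a variable sliding window that shrinks when concept drift is detected, together with a filter for maximal frequent itemsets. It is not the thesis's VSW-SCPS or LVSW-OSCPS algorithm: it uses brute-force counting instead of an SCPS-tree or OSCPS-tree, and it performs the explicit superset check that LVSW-OSCPS is designed to avoid. The class `VariableWindowMiner`, its parameters (`min_support`, `check_every`, `drift_threshold`, `max_size`), and the drift measure (the fraction of frequent itemsets that changed between checkpoints) are all assumptions made for illustration only.

```python
from collections import deque
from itertools import combinations


def frequent_itemsets(window, min_support):
    """Return itemsets whose relative support in `window` is >= min_support.

    Brute force: enumerates every subset of every transaction, so it is only
    suitable for short transactions (illustration, not a tree-based miner).
    """
    counts = {}
    for transaction in window:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] = counts.get(combo, 0) + 1
    threshold = min_support * len(window)
    return {frozenset(c) for c, n in counts.items() if n >= threshold}


def maximal_itemsets(frequent):
    """Keep only frequent itemsets with no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


class VariableWindowMiner:
    """Hypothetical variable sliding window with a simple drift check."""

    def __init__(self, min_support=0.5, check_every=4,
                 drift_threshold=0.5, max_size=None):
        self.min_support = min_support
        self.check_every = check_every          # transactions between checkpoints
        self.drift_threshold = drift_threshold  # assumed change ratio that signals drift
        self.max_size = max_size                # optional cap (a "limited" window)
        self.window = deque()
        self.reference = set()                  # frequent itemsets at last checkpoint
        self.seen = 0

    def add(self, transaction):
        """Append one transaction and run the drift check at each checkpoint."""
        self.window.append(frozenset(transaction))
        if self.max_size is not None and len(self.window) > self.max_size:
            self.window.popleft()
        self.seen += 1
        if self.seen % self.check_every == 0:
            self._check_drift()

    def _check_drift(self):
        # Mine only the most recent pane and compare with the previous pane.
        recent = list(self.window)[-self.check_every:]
        current = frequent_itemsets(recent, self.min_support)
        if self.reference:
            union = current | self.reference
            change = len(current ^ self.reference) / len(union)
            if change >= self.drift_threshold:
                # Drift assumed: discard everything older than the recent pane.
                while len(self.window) > self.check_every:
                    self.window.popleft()
        self.reference = current

    def mine(self):
        """Frequent itemsets over the whole current window."""
        return frequent_itemsets(self.window, self.min_support)

    def mine_maximal(self):
        """Maximal frequent itemsets over the whole current window."""
        return maximal_itemsets(self.mine())
```

As a usage illustration under these assumptions: feeding four transactions of `{"a", "b"}` followed by four of `{"c", "d"}` with the default parameters causes the window to grow to eight transactions and then shrink to the last four once the checkpoint sees a completely different set of frequent itemsets. Setting `max_size` bounds the window regardless of drift, which loosely mirrors the motivation for a limited window described in the abstract.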