Research On Data Stream Frequent Itemsets Mining

Posted on: 2008-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: X S Zheng
GTID: 2178360212968339
Subject: Computer applications
Abstract/Summary:
A data stream is an ordered sequence of items that arrive in time order. Unlike data in traditional static databases, data streams are continuous, unbounded, usually arrive at high speed, and have a data distribution that often changes over time [1]. These characteristics make it difficult for traditional mining algorithms to cope with data streams. Many researchers have studied frequent itemset mining in data streams, and it is now one of the most basic problems in data mining.

Based on the characteristics of data streams, this thesis surveys and summarizes data stream processing techniques and the issues that arise in data stream mining, and examines some techniques for addressing those issues. It introduces several classical frequent itemset mining algorithms and evaluates them experimentally. The experiments and analysis show that classical frequent itemset mining algorithms are difficult to extend to data streams, because streams are unbounded and arrive at high speed. The thesis also introduces, analyzes, and summarizes several existing data stream mining algorithms.

Finally, the thesis proposes the FP-CountMin algorithm. The algorithm partitions the data stream into segments and uses a modified FP-growth algorithm to mine the frequent itemsets in each segment. It then counts the itemsets in a Count-Min Sketch. The algorithm addresses the problems of compact statistics and efficient computation. Experiments and a comparison with the FP-DS algorithm show that FP-CountMin has good time efficiency.
Keywords/Search Tags: data stream, data mining, data stream mining, frequent itemsets
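
The pipeline outlined in the abstract (partition the stream, mine each segment, accumulate itemset counts in a Count-Min Sketch) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the class and function names, the table dimensions, the use of BLAKE2b as the hash family, and the segment size are all assumptions made for illustration, and the per-segment miner is a brute-force enumeration standing in for the modified FP-growth step, whose details the abstract does not give.

```python
import hashlib
from itertools import combinations


class CountMinSketch:
    """Fixed-size counter table; estimates can overcount but never undercount."""

    def __init__(self, width=2048, depth=4):
        # width and depth are illustrative defaults, not values from the thesis.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One salted hash per row (assumption: BLAKE2b as the hash family).
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key):
        # The minimum over rows is the least-collided counter.
        return min(
            self.rows[row][self._index(key, row)] for row in range(self.depth)
        )


def itemset_key(itemset):
    """Canonical string key for an itemset, independent of item order."""
    return ",".join(sorted(itemset))


def mine_segment(segment, min_count, max_size=2):
    """Brute-force stand-in for the per-segment miner (NOT FP-growth)."""
    counts = {}
    for transaction in segment:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] = counts.get(combo, 0) + 1
    return {combo: n for combo, n in counts.items() if n >= min_count}


def process_stream(transactions, segment_size, min_count, sketch):
    """Cut the stream into segments and add each segment's frequent itemsets."""

    def flush(segment):
        for combo, n in mine_segment(segment, min_count).items():
            sketch.add(itemset_key(combo), n)

    segment = []
    for transaction in transactions:
        segment.append(transaction)
        if len(segment) == segment_size:
            flush(segment)
            segment = []
    if segment:  # trailing partial segment
        flush(segment)


# Hypothetical toy stream of six transactions, processed in segments of three.
stream = [["a", "b"], ["a", "b", "c"], ["a", "c"],
          ["b", "c"], ["a", "b"], ["a", "b"]]
sketch = CountMinSketch()
process_stream(stream, segment_size=3, min_count=2, sketch=sketch)
print(sketch.estimate(itemset_key(["b", "a"])))  # at least 4
```

In this sketch, memory stays fixed at width × depth counters no matter how many distinct itemsets the stream produces, which is the compact-statistics property the abstract refers to. An itemset that falls below `min_count` within a segment contributes nothing for that segment, so the accumulated count is approximate in both directions relative to the true stream-wide support.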