Mining Frequent Itemsets Over Recent Data Stream

Posted on: 2011-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Han
GTID: 2178330332460943
Subject: Computer application technology
Abstract/Summary:
Finding frequent items quickly and accurately in large data streams is an important basis for prediction and decision-making. This paper presents an approach for mining frequent itemsets over a data stream within the current window. The study combines sliding window techniques, frequent itemset mining, genetic algorithms, and parallel processing.

Sliding windows have been used in network communication, time-series data mining, data stream mining, and other areas. The algorithm uses a sliding window to capture the current portion of the data stream. A genetic algorithm then searches for frequent itemsets, mainly through crossover, mutation, and selection; after several generations of selection, a final set of frequent itemsets is obtained. We use the standard-pattern parallel genetic algorithm (PGA), and the parallel part of the program can be run on a GPU.

First, the nested sliding window divides the data into data sets. The method then uses the parallelism, global optimization ability, and capacity for processing massive data of genetic algorithms to search for frequent itemsets in the sliding window. As the data stream flows, the method captures the latest frequent itemsets accurately and in a timely manner, and it periodically deletes expired data. Owing to the nested windows and the parallel processing capability of the genetic algorithm, the method reduces both space complexity and time complexity. Experiments show that the method is effective and practical.
Keywords/Search Tags: data stream, frequent item sets, genetic algorithm, nested sliding window
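
The general idea in the abstract (a sliding window that keeps only recent transactions, plus a genetic algorithm that searches that window for frequent itemsets) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract gives no details of the nested window layout, the chromosome encoding, the fitness function, or the GPU-parallel part, so everything below is an assumption made for illustration. In particular, the sketch uses a single (non-nested) window, a bit-vector encoding over the items in the window, tournament selection, and runs sequentially. The names `support`, `ga_frequent_itemsets`, `mine_stream`, and all parameters are hypothetical.

```python
import random
from collections import deque


def support(itemset, window):
    """Fraction of transactions in the window that contain every item of itemset."""
    if not window:
        return 0.0
    return sum(1 for t in window if itemset <= t) / len(window)


def ga_frequent_itemsets(window, min_support, pop_size=30, generations=20,
                         mutation_rate=0.05, seed=0):
    """Heuristic search: returns {itemset: support} for frequent itemsets it visited.

    Assumed encoding: one boolean per distinct item in the window.
    The search is not exhaustive, so some frequent itemsets may be missed.
    """
    rng = random.Random(seed)
    items = sorted(set().union(*window))
    if not items:
        return {}

    def decode(chrom):
        return frozenset(i for i, bit in zip(items, chrom) if bit)

    def fitness(chrom):
        s = decode(chrom)
        if not s:
            return 0.0
        sup = support(s, window)
        # Assumed fitness: favour larger itemsets, but only once they are frequent.
        return sup * len(s) if sup >= min_support else 0.1 * sup

    population = [[rng.random() < 0.3 for _ in items] for _ in range(pop_size)]
    found = {}
    for _ in range(generations):
        # Record every frequent itemset present in the current generation.
        for chrom in population:
            s = decode(chrom)
            if s:
                sup = support(s, window)
                if sup >= min_support:
                    found[s] = sup

        def pick():
            # Tournament selection between two random individuals.
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b

        next_population = []
        while len(next_population) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(len(items) + 1)           # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]                    # bit-flip mutation
            next_population.append(child)
        population = next_population
    return found


def mine_stream(stream, window_size, min_support, step):
    """Yield (transactions_seen, frequent_itemsets) every `step` transactions."""
    window = deque(maxlen=window_size)  # oldest transactions expire automatically
    for n, transaction in enumerate(stream, 1):
        window.append(frozenset(transaction))
        if n % step == 0:
            yield n, ga_frequent_itemsets(window, min_support)
```

A possible usage, with a made-up stream of transactions:

```python
stream = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["a", "b"], ["d", "e"], ["d", "e"]]
for n, itemsets in mine_stream(stream, window_size=4, min_support=0.5, step=2):
    print(n, sorted((sorted(s), sup) for s, sup in itemsets.items()))
```

Because the genetic search is heuristic, every itemset it reports is verified as frequent in the current window, but the result is not guaranteed to be complete. The fitness evaluations within a generation are independent of each other, which is the kind of work that a parallel genetic algorithm could distribute.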