
The Research On Algorithm Of Frequent Itemsets Mining In Data Streams

Posted on: 2015-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: C P Bai
Full Text: PDF
GTID: 2268330428982556
Subject: Computer technology
Abstract/Summary:
Frequent itemset mining is one of the important subjects in data mining and has been studied extensively over the last decade. It is used by many data mining applications, such as the discovery of association rules, correlations, sequential rules, and episodes.

Recently, there has been much interest in data that arrive as continuous streams, known as data streams. Data streams arise in several application domains, such as high-speed networking, transaction logs, finance, and sensor networks. They possess unique computational characteristics: unknown or unbounded length, a possibly high arrival rate, the inability to backtrack over previously arrived items (only one sequential pass over the data is permitted), and a lack of system control over the order in which the data arrive. Within data stream research, extending mining techniques to data streams has attracted much attention. However, most frequent itemset mining algorithms were developed for datasets held in persistent storage and involve multiple passes over the dataset, so they cannot be applied directly to data streams.

This thesis discusses the weighted sliding window model and proposes a frequent itemset mining algorithm, FIMWSW, suited to that model. It then optimizes FIMWSW and introduces an improved algorithm, FIMWSW-Imp. The work includes: (1) We improve the sliding window data stream model and present a weighted sliding window data stream model. (2) We design a frequent itemset mining algorithm, FIMWSW, suited to the new model. Because FIMWSW can produce a great number of candidate itemsets, we present an improved algorithm, FIMWSW-Imp.

A detailed experimental study demonstrates that FIMWSW-Imp outperforms FIMWSW. Extensive experimental results show significant gains in scalability and efficiency on both sparse and dense datasets at all levels of support threshold.
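
To make the idea of a weighted sliding window concrete, the following is a minimal sketch of one possible formulation. It is not the thesis's FIMWSW or FIMWSW-Imp algorithm, whose details the abstract does not give. It assumes that the window holds a fixed number of transaction batches, that each batch position carries a user-chosen weight (newer batches typically weigh more), and that an itemset's weighted support is its weighted count divided by the weighted number of transactions. The class name `WeightedSlidingWindow`, its methods, and the `max_size` limit are all hypothetical, and the brute-force enumeration is for illustration only.

```python
from collections import deque
from itertools import combinations


class WeightedSlidingWindow:
    """Hypothetical weighted sliding window over batches of transactions."""

    def __init__(self, weights):
        # weights[0] applies to the oldest batch, weights[-1] to the newest.
        self.weights = list(weights)
        # A bounded deque drops the oldest batch once the window is full.
        self.batches = deque(maxlen=len(self.weights))

    def add_batch(self, transactions):
        """Slide the window forward by one batch of transactions."""
        self.batches.append([frozenset(t) for t in transactions])

    def frequent_itemsets(self, min_support, max_size=2):
        """Return {itemset: weighted support} for itemsets meeting min_support.

        Assumed definition: weighted support = sum of batch weights over the
        transactions containing the itemset, divided by the weighted number
        of transactions in the window.
        """
        if not self.batches:
            return {}
        # If the window is not yet full, use the weights of the newest slots.
        weights = self.weights[-len(self.batches):]
        total = sum(w * len(b) for w, b in zip(weights, self.batches))
        if total == 0:
            return {}
        counts = {}
        for w, batch in zip(weights, self.batches):
            for transaction in batch:
                # Brute-force enumeration of small itemsets (illustrative only).
                for k in range(1, max_size + 1):
                    for itemset in combinations(sorted(transaction), k):
                        counts[itemset] = counts.get(itemset, 0.0) + w
        return {s: c / total for s, c in counts.items() if c / total >= min_support}


if __name__ == "__main__":
    window = WeightedSlidingWindow(weights=[1, 3])
    window.add_batch([{"a", "b"}, {"a"}])
    window.add_batch([{"b", "c"}, {"b"}])
    print(window.frequent_itemsets(min_support=0.3))
```

Enumerating every subset of every transaction, as this sketch does, generates many candidate itemsets; the abstract names that kind of candidate growth as the motivation for the improved algorithm.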
Keywords/Search Tags: Data streams, Frequent itemsets, Sliding window