Research On Algorithm For Mining Frequent Closed Itemsets Over Data Streams

Posted on: 2012-07-04
Degree: Master
Type: Thesis
Country: China
Candidate: S Lai
Full Text: PDF
GTID: 2178330335466790
Subject: Computer application technology
Abstract/Summary:
The emergence of data streams poses tremendous challenges to traditional data mining technology. Because a data stream arrives continuously, existing data processing techniques struggle to manage and mine such potentially unbounded and dynamic data. With the wide adoption of mobile terminal equipment, data stream applications continue to increase, so mining techniques suited to the data stream environment must be studied. The topic has attracted extensive attention from scholars at home and abroad and has become a research hotspot.

In research on data stream mining algorithms, frequent itemset mining is an important topic. It is widely used in association rules, iceberg queries, classification, and clustering. Most traditional methods focus on mining all frequent itemsets over data streams, which leads to redundancy in both the data and the resulting models. Frequent closed itemsets preserve the complete information of all frequent itemsets, while their number is much smaller than that of the frequent itemsets. Therefore, in recent years, researchers have started to focus on mining frequent closed itemsets over data streams.

In this thesis, we explore several key issues in data stream mining. We primarily study the problem of mining frequent closed itemsets over data streams. We also propose a new algorithm and analyze it together with the corresponding experimental results. In summary, this thesis mainly covers the following aspects:

1. We analyze the characteristics of data streams in comparison with traditional relational databases, introduce several data stream processing models, and summarize the commonly used data stream processing techniques.
2. We analyze and summarize the characteristics of data stream mining algorithms and models, and introduce the current data stream mining algorithms. We analyze some classical frequent itemset mining algorithms over data streams to become familiar with the aspects involved in the data stream mining procedure, such as storage structures and storage methods, the generation and maintenance of summary data structures, and real-time retrieval of results.
3. Frequent closed itemsets contain complete information about all frequent itemsets, yet their number is smaller than that of the frequent itemsets. This thesis studies the problem of mining frequent closed itemsets over data streams. We propose a frequent closed itemset mining algorithm based on the sliding window processing model, which mines the most recent information of interest to the user and stores it in a new compressed storage structure. This structure can be updated and maintained incrementally, and it supports fast mining of all frequent closed itemsets over the sliding window. The experimental results show that the algorithm is effective.
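
The relationship between frequent itemsets and frequent closed itemsets over a sliding window can be illustrated with a small sketch. This is not the algorithm or the compressed storage structure proposed in the thesis, neither of which is described in the abstract. It is a brute-force illustration, and every name in it (`SlidingWindowClosedMiner`, `window_size`, `min_support`) is hypothetical. It assumes the window is measured in transactions and that the minimum support is an absolute count. An itemset is treated as closed when no proper superset has the same support.

```python
from collections import deque
from itertools import combinations


class SlidingWindowClosedMiner:
    """Brute-force illustration only; not the thesis's algorithm."""

    def __init__(self, window_size, min_support):
        # deque(maxlen=...) evicts the oldest transaction automatically
        # once the window is full.
        self.window = deque(maxlen=window_size)
        self.min_support = min_support  # absolute count (assumption)

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def support(self, itemset):
        # Number of transactions in the window that contain the itemset.
        return sum(1 for t in self.window if itemset <= t)

    def frequent_itemsets(self):
        items = sorted(set().union(*self.window))
        result = {}
        for k in range(1, len(items) + 1):
            found = False
            for combo in combinations(items, k):
                candidate = frozenset(combo)
                count = self.support(candidate)
                if count >= self.min_support:
                    result[candidate] = count
                    found = True
            if not found:
                # No frequent k-itemset implies no frequent (k+1)-itemset.
                break
        return result

    def frequent_closed_itemsets(self):
        frequent = self.frequent_itemsets()
        closed = {}
        for itemset, count in frequent.items():
            # Closed: no proper superset has the same support.
            has_equal_superset = any(
                itemset < other and count == other_count
                for other, other_count in frequent.items()
            )
            if not has_equal_superset:
                closed[itemset] = count
        return closed


miner = SlidingWindowClosedMiner(window_size=3, min_support=2)
for transaction in [{"c", "d"}, {"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}]:
    miner.add(transaction)  # the first transaction is evicted by the fourth

print(len(miner.frequent_itemsets()))
print(miner.frequent_closed_itemsets())
```

In this example the window ends up holding the last three transactions. Seven itemsets are frequent, but only two of them, {a, b} with support 3 and {a, b, c} with support 2, are closed, and the support of every other frequent itemset can be recovered from those two. A practical stream miner would avoid recomputing everything on each arrival and update a summary structure incrementally instead, which is the kind of maintenance the abstract refers to.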
Keywords/Search Tags: Data Streams, Data Stream Mining, Frequent Itemsets, Frequent Closed Itemsets, Sliding Window, Basic Window