Research On Multi-stream Frequent Item Set Mining Algorithm

Posted on: 2018-07-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2358330518968282
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Internet technology, network data has become increasingly diverse. As a new type of data, the data stream is now widely used in many fields; examples include data from sensor networks, financial data in financial applications, and location data obtained from GPS positioning systems. Because such data is unbounded, continuous, high-speed, ordered, and massive, traditional data mining techniques are difficult to apply directly to extract useful information from it. Data stream mining is therefore a problem of great significance.

This thesis studies algorithms for mining frequent itemsets in multiple data streams. First, it describes the background and significance of the topic and summarizes the state of research at home and abroad. Second, it describes the relevant data processing techniques, including data stream mining, frequent itemset mining over data streams, and three classical algorithms for mining frequent itemsets over data streams. Finally, it proposes two algorithms for mining frequent itemsets over multiple data streams. The main work covers three aspects:

(1) We study the data structures used by frequent pattern mining algorithms and design an improved data structure based on the FP-tree. Building on a further study of the characteristics and forms of data streams, we design a prefix-tree storage structure based on lexicographic order. The window model incrementally updates and retains the counts of frequent itemsets, which improves memory utilization and, to a certain extent, the space complexity of the algorithm.

(2) We study algorithms for mining frequent patterns in multiple data streams and design an efficient algorithm for mining collaborative frequent itemsets in multiple data streams. The problem of collaborative frequent itemsets in multiple data streams is proposed here for the first time. A collaborative frequent itemset is a group of objects that frequently occur together within a very short period of time on the same data stream, and that do so in the same way on many data streams. First, an algorithm for mining frequent itemsets with a time-interval sliding window based on bit sequences finds the potential frequent itemsets and the frequent itemsets in each data stream. Second, a frequent pattern tree is constructed to store the potential frequent itemsets and frequent itemsets of the multiple data streams, and the frequency of each corresponding itemset is incrementally updated in a logarithmic tilted table. Finally, the collaborative frequent itemsets in the multiple data streams are analyzed.

(3) We study algorithms for mining frequent itemsets over multiple data streams in a distributed environment and design an efficient parallel algorithm for mining collaborative frequent itemsets in multiple data streams. In the era of big data, the scale and speed of data streams are increasing rapidly, and a single computing node with limited memory cannot handle such a volume of data. Traditional centralized frequent itemset mining algorithms therefore struggle to cope with the growing size of the data. To solve this problem, this thesis adopts a parallel computing model and designs a distributed index structure that can easily be spread across different computing nodes, so that collaborative frequent itemsets in multiple data streams can be mined in a distributed environment.
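The bit-sequence sliding window and the cross-stream step described in (2) can be illustrated with a small sketch. It is not the thesis's algorithm. It assumes a count-based window (the last `size` transactions) instead of a time-interval window, it uses a level-wise candidate join in place of the frequent pattern tree, and it treats an itemset as "collaborative" when it is frequent in at least `min_streams` streams. The names `BitSequenceWindow`, `collaborative_itemsets`, `min_support`, and `min_streams` are hypothetical.

```python
from itertools import combinations


class BitSequenceWindow:
    """Count-based sliding window over one stream (an assumed simplification).

    Each item maps to an integer whose set bits mark the window slots
    (bit 0 = newest transaction) in which the item occurs.
    """

    def __init__(self, size):
        self.size = size
        self.mask = (1 << size) - 1
        self.bits = {}

    def add(self, transaction):
        items = set(transaction)
        # Shift every bit sequence by one slot; the oldest slot falls off.
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask
            if shifted or item in items:
                self.bits[item] = shifted
            else:
                del self.bits[item]  # item no longer occurs in the window
        # Mark the newest slot for every item of the new transaction.
        for item in items:
            self.bits[item] = self.bits.get(item, 0) | 1

    def support(self, itemset):
        # Support = number of slots in which all items occur together.
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")

    def frequent_itemsets(self, min_support):
        # Level-wise search over itemsets kept as lexicographically sorted tuples.
        level = [(i,) for i in sorted(self.bits)
                 if self.support((i,)) >= min_support]
        result = {}
        while level:
            for itemset in level:
                result[itemset] = self.support(itemset)
            candidates = set()
            for a, b in combinations(level, 2):
                if a[:-1] == b[:-1]:  # join itemsets sharing a prefix
                    candidates.add(a + (b[-1],))
            level = sorted(c for c in candidates
                           if self.support(c) >= min_support)
        return result


def collaborative_itemsets(windows, min_support, min_streams):
    """Itemsets that are frequent in at least `min_streams` windows (assumed definition)."""
    counts = {}
    for window in windows:
        for itemset in window.frequent_itemsets(min_support):
            counts[itemset] = counts.get(itemset, 0) + 1
    return {itemset for itemset, n in counts.items() if n >= min_streams}
```

Storing one integer per item keeps the support of any itemset to a bitwise AND followed by a population count, and sliding the window is a single shift per item. A time-interval window, as in the thesis, would need each slot to represent a time unit and not a single transaction.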
Keywords/Search Tags: Data Mining, Multiple Data Streams, Sliding Window Model, Parallel Algorithm, Frequent Itemsets, Collaborative Frequent Itemsets