
A Data Stream Frequent Itemset Mining Algorithm Based On Vertical Division

Posted on: 2012-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: J B Zhu
GTID: 2218330368982946
Subject: Computer application technology
Abstract/Summary:
As a new form of data, data streams are widespread in the real world: sensor network monitoring, climate monitoring, phone call records, network traffic monitoring, real-time stock quotes, web click streams, and site visit counter logs all generate large volumes of streaming data. Consequently, frequent itemset mining has become a research hotspot in data stream mining, both domestically and internationally. Traditional frequent itemset mining algorithms are mostly designed for a centralized, single-processor hardware environment. When the data arrival rate and the data volume increase rapidly, problems such as data loss, growing error, and degraded throughput appear. To address this, this thesis proposes a new algorithm built on FP-Stream: a parallel frequent itemset mining algorithm. Running in a distributed multiprocessor environment, the algorithm mines frequent itemsets in parallel and then integrates the results. It first divides the global data vertically into sub-data sets covering different time periods and mines each one; an integration subsystem then merges the frequent itemsets found for each sub-data set. The new algorithm also uses the tilted-time window data structure, and it ultimately provides a query service, so the mining results carry time information as well. Most importantly, because the design is distributed, throughput can grow along with the data volume as far as the hardware environment allows. Simulation experiments show that, although it requires some additional space, the algorithm based on vertical division achieves higher throughput than FP-Stream and can solve the problems of traditional frequent itemset mining, which should further broaden its range of application.
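The divide, mine, and integrate strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it uses brute-force itemset counting in place of an FP-tree, omits the tilted-time window, and uses a thread pool as a stand-in for distributed nodes (which does not give true CPU parallelism in CPython). All function names, the `slice_len` parameter, and the sample stream are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_by_time(stream, slice_len):
    """Group (timestamp, items) records into consecutive time slices."""
    slices = {}
    for ts, items in stream:
        slices.setdefault(ts // slice_len, []).append(frozenset(items))
    return [slices[k] for k in sorted(slices)]


def count_itemsets(transactions, max_size=2):
    """Count every itemset of up to max_size items in one slice (brute force)."""
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                counts[combo] += 1
    return counts


def mine_parallel(stream, slice_len, min_support, max_size=2, workers=4):
    """Mine each time slice concurrently, then merge the per-slice counts."""
    slices = split_by_time(stream, slice_len)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda s: count_itemsets(s, max_size), slices))
    total = sum(partial, Counter())          # integration step: add up counts
    n = sum(len(s) for s in slices)          # total number of transactions
    return {iset: c for iset, c in total.items() if c / n >= min_support}


# Hypothetical sample stream: (timestamp, items) pairs spanning two slices.
stream = [
    (0, ["a", "b"]), (1, ["a", "c"]), (2, ["a", "b", "c"]),
    (10, ["a", "b"]), (11, ["b", "c"]), (12, ["a", "b"]),
]
frequent = mine_parallel(stream, slice_len=10, min_support=0.5)
```

Because each slice reports its full counts rather than only its locally frequent itemsets, the merged result here is exact. A real system would likely prune per slice to save space and would then need an error bound of the kind FP-Stream uses.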
Keywords/Search Tags: data stream, frequent itemset mining, parallel, FP-Stream, tilted-time window