
The Research On The Algorithm Of Mining Frequent Patterns Over Data Streams

Posted on: 2011-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: W Huang
Full Text: PDF
GTID: 2178360332457615
Subject: Computer application technology
Abstract/Summary:
With the development of information technology, the volume of data held in databases has grown rapidly, and the lack of techniques for analyzing and processing it has become increasingly apparent. This demand has given a strong boost to the emergence of Knowledge Discovery in Databases (KDD). Data mining is an important step in the KDD process, in which intelligent algorithms are used to discover interesting patterns in large amounts of data. Frequent pattern mining is a very important problem in data mining. Recently, large amounts of data have accumulated in the form of data streams, such as web data and transaction data. Unlike traditional static databases, data streams are continuous, unordered and real-time, which poses many new challenges for mining them, and mining frequent patterns over data streams has become both a difficult and an active research topic.

This thesis studies one data stream mining problem: mining frequent patterns over data streams. The main results are as follows.

First, data stream mining technology and the characteristics of data streams are introduced, followed by the basic concepts and key problems. Several typical algorithms for mining frequent patterns over data streams are then studied.

Second, a new algorithm, Prefix-stream, is proposed for mining frequent patterns over data streams based on the landmark window. A new data structure introduced in the thesis, the P-tree, is used to mine, maintain and update frequent patterns over the data stream at the same time. The algorithm can also distinguish the patterns of recently generated transactions from those of historical transactions by means of a logarithmic tilted-time window. The experimental results show that the proposed algorithm outperforms the earlier FP-stream algorithm.

Finally, a new algorithm, PSW, is proposed for mining frequent patterns over data streams based on the sliding window. A sliding window is divided into several basic windows, and the basic window serves as the unit of updating. A compact PSW-tree is used to mine the frequent patterns in each basic window and to maintain all frequent patterns. Obsolete and infrequent items are deleted. The experimental results indicate that the PSW algorithm performs efficiently.
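The sliding-window scheme summarized above (a window made of several basic windows, updated one basic window at a time, with obsolete and infrequent entries dropped) can be illustrated with a minimal Python sketch. This is not the thesis's PSW algorithm or its PSW-tree: it replaces the tree with plain hash-based counters, and the class name `SlidingWindowMiner`, its parameters (`num_basic_windows`, `min_support`, `max_len`) and the cap on itemset length are all assumptions made only for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Toy sliding-window itemset counter (hypothetical; not the PSW-tree)."""

    def __init__(self, num_basic_windows, min_support, max_len=2):
        self.num_basic_windows = num_basic_windows  # basic windows per sliding window
        self.min_support = min_support              # relative support threshold in (0, 1]
        self.max_len = max_len                      # cap on itemset size, to keep the sketch small
        self.windows = deque()                      # (itemset counts, transaction count) per basic window
        self.totals = Counter()                     # itemset counts over the whole sliding window
        self.num_transactions = 0

    def _count(self, transactions):
        # Count every itemset of size 1..max_len occurring in each transaction.
        counts = Counter()
        for transaction in transactions:
            items = sorted(set(transaction))
            for k in range(1, self.max_len + 1):
                counts.update(combinations(items, k))
        return counts

    def add_basic_window(self, transactions):
        # The basic window is the unit of updating: add the newest one...
        counts = self._count(transactions)
        self.windows.append((counts, len(transactions)))
        self.totals.update(counts)
        self.num_transactions += len(transactions)
        # ...and evict the oldest once the sliding window is full.
        if len(self.windows) > self.num_basic_windows:
            old_counts, old_size = self.windows.popleft()
            self.totals.subtract(old_counts)
            self.num_transactions -= old_size
            self.totals = +self.totals  # unary plus drops entries whose count fell to zero

    def frequent_patterns(self):
        # Report itemsets whose count meets the support threshold in the current window.
        threshold = self.min_support * self.num_transactions
        return {p: c for p, c in self.totals.items() if c >= threshold}


miner = SlidingWindowMiner(num_basic_windows=2, min_support=0.5)
miner.add_basic_window([["a", "b"], ["a", "c"]])
miner.add_basic_window([["a", "b"], ["b", "c"]])
miner.add_basic_window([["c", "d"], ["c", "d"]])  # evicts the first basic window
print(miner.frequent_patterns())
```

Keeping per-basic-window counts makes eviction a simple subtraction, at the cost of storing every counted itemset; a compact tree such as the one the thesis describes would presumably share prefixes to reduce that cost. The logarithmic tilted-time window used by Prefix-stream is not covered by this sketch.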
Keywords/Search Tags: Data Mining, Data Streams, Frequent Pattern, Frequent Pattern Tree, Landmark Window, Sliding Window