Mining Association Rules In Data Streams

Posted on: 2008-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: D Hu
Full Text: PDF
GTID: 2178360272969429
Subject: Computer software and theory
Abstract/Summary:
In the early 1990s, R. Agrawal introduced the concept of association rule mining, which discovers interesting associations and correlations among itemsets in large volumes of data. After more than ten years of development, association rule mining has become an important and mature method in data mining.

The data stream is a new type of data arising in recent applications such as Internet monitoring, financial applications, dynamic tracking of stock fluctuations, and sensor networks. Traditional data mining algorithms, which are designed for static databases, cannot cope with high-volume, open-ended data streams. As data stream applications multiply, researchers are paying increasing attention to data mining in data streams.

Unlike traditional database processing, data stream processing does not preserve the entire data set. It maintains only a summary data structure, much smaller than the data set itself, that represents the whole stream and can therefore be held in memory. Users query this summary structure rather than the raw data, which keeps the system responsive, and they obtain approximate results. Sliding-window streams are time-sensitive: only the most recent elements of the stream are considered, which addresses the time-efficiency problem and better matches real applications.

Based on these characteristics of data streams, this thesis proposes a new algorithm named DSFPM for mining frequent patterns in sliding windows over data streams. DSFPM mines frequent patterns batch by batch and maintains a DSFPM-Tree data structure that stores all potentially frequent patterns; the tree is updated each time a new basic window enters the sliding window. The frequent closed itemsets are the most compact representation of the frequent itemsets that loses no support information. The algorithm therefore saves and processes only the sub-frequent closed itemsets of each basic window, which improves both time and space efficiency.

Experiments on synthetic data sets were designed and performed to evaluate the algorithm. The results show the feasibility and validity of DSFPM for mining frequent patterns over data streams.
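
To make the sliding-window and closed-itemset ideas concrete, the following is a minimal Python sketch. It is not DSFPM and does not implement the DSFPM-Tree, whose details the abstract does not give. It assumes that a sliding window consists of a fixed number of basic windows (batches) and that support is measured relative to the number of transactions currently in the window. It keeps exact per-batch counts instead of only the sub-frequent closed itemsets. The class name `SlidingWindowMiner` and the parameters `num_basic_windows`, `min_support`, and `max_len` are hypothetical names chosen for illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowMiner:
    """Toy batch-by-batch itemset counter over a sliding window.

    Illustrative only: exact counting by brute-force enumeration,
    not the DSFPM-Tree described in the thesis.
    """

    def __init__(self, num_basic_windows, min_support, max_len=3):
        self.num_basic_windows = num_basic_windows  # basic windows per sliding window
        self.min_support = min_support              # relative threshold in (0, 1]
        self.max_len = max_len                      # cap on itemset size (assumption)
        self.windows = deque()                      # (counts, n_transactions) per batch
        self.totals = Counter()                     # counts over the whole sliding window
        self.n_transactions = 0

    def add_basic_window(self, transactions):
        """Count itemsets in the new batch, then evict the oldest batch if needed."""
        counts = Counter()
        for transaction in transactions:
            items = sorted(set(transaction))
            for k in range(1, min(self.max_len, len(items)) + 1):
                for combo in combinations(items, k):
                    counts[frozenset(combo)] += 1
        self.windows.append((counts, len(transactions)))
        self.totals.update(counts)
        self.n_transactions += len(transactions)

        if len(self.windows) > self.num_basic_windows:
            old_counts, old_n = self.windows.popleft()
            self.totals.subtract(old_counts)
            self.totals = +self.totals  # unary plus drops entries with count <= 0
            self.n_transactions -= old_n

    def frequent(self):
        """Itemsets whose count meets the relative support threshold."""
        threshold = self.min_support * self.n_transactions
        return {s: c for s, c in self.totals.items() if c >= threshold}

    def closed_frequent(self):
        """Frequent itemsets with no enumerated proper superset of equal support."""
        freq = self.frequent()
        return {
            s: c
            for s, c in freq.items()
            if not any(s < t and c == freq[t] for t in freq)
        }
```

Each call to `add_basic_window` corresponds to one basic window entering the sliding window, and `closed_frequent` returns the reduced set from which the support of every enumerated frequent itemset can still be recovered. Because enumeration stops at `max_len` items, closedness is checked only against itemsets up to that size.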
Keywords/Search Tags: data streams, frequent patterns, sliding windows, data mining