An In-Memory Data Structure for Targeted Association Rule Mining in Time-Varying Domains

Posted on: 2014-09-12
Degree: Ph.D.
Type: Dissertation
University: University of Louisiana at Lafayette
Candidate: Lavergne, Jennifer Jean Stutes
Full Text: PDF
GTID: 1458390008453796
Subject: Computer Science
Abstract/Summary:
As companies and government agencies accumulate large repositories of time-stream (temporal) data, there is a strong push to adapt association rule mining for dynamic, targeted querying. In addition, data processing latency and the depreciation of results over time create a need for faster, more efficient processing. The aim of targeted association mining is to find potentially interesting implications in large repositories of data. With targeted association mining techniques, specific implications that contain items of interest to the user can be found faster, before they have depreciated in value beyond usefulness.

This dissertation combines the discovery of frequently and rarely occurring implications with dynamic targeted association mining to discover emergent, current, and declining rare and frequent implications of interest to the user in time-stream data. First, the Itemset Tree data structure for targeted mining was modified to create an Ordered Min-Max Itemset Tree, which decreases the number of extraneous nodes visited during a query search. The augmented structure exploits the ordered nature of the tree and knowledge of the minimum and maximum values contained within each subtree, so that a subtree can be skipped if it fails certain requirements and the tree search can terminate early. Next, the Itemset Tree querying algorithm was modified to discover both rare and frequently occurring patterns. Due to the linear nature of the querying process, rare patterns can be discovered as efficiently as frequently occurring patterns, unlike in traditional association mining methods. Finally, to enable the Itemset Tree algorithms for dynamic implication discovery, a processing loop was developed. The loop begins by adding nodes to the current tree and removing nodes that are no longer relevant. Next, the tree for the current time window is processed using user-defined queries, and frequent and rare patterns are discovered. Finally, association rules are discovered within the pattern sets and returned. This new algorithm observes patterns as they emerge, decline, and remain linear over a given dataset and its time windows. Examples of such implications include unexpected increases in traffic accidents and sudden epidemic outbreaks.
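
The following is a minimal Python sketch of the ideas summarized above. It is not the dissertation's Ordered Min-Max Itemset Tree. As a simplifying assumption, it replaces the tree with a flat index that groups transactions by their smallest item and records each group's largest item. This is enough to illustrate skipping a group on a min-max check and stopping the search early because the groups are ordered. All function names, the integer item encoding, and the `rare_max` / `freq_min` thresholds are hypothetical. The index is also rebuilt for every window rather than updated node by node, which a real implementation would presumably avoid.

```python
from collections import defaultdict, deque


def build_index(transactions):
    """Group transactions by smallest item and track each group's largest item."""
    groups = defaultdict(lambda: {"max": None, "rows": []})
    for t in transactions:
        items = tuple(sorted(set(t)))
        if not items:
            continue
        group = groups[items[0]]
        group["rows"].append(frozenset(items))
        group["max"] = items[-1] if group["max"] is None else max(group["max"], items[-1])
    return dict(groups)


def support_count(index, query):
    """Count transactions that contain every item of a non-empty query."""
    q = frozenset(query)
    if not q:
        raise ValueError("query must contain at least one item")
    lo, hi = min(q), max(q)
    count = 0
    for first in sorted(index):
        if first > lo:
            # Ordered groups: every later group starts above the query's
            # smallest item, so none can contain it (early termination).
            break
        group = index[first]
        if group["max"] < hi:
            # Min-max check: no transaction in this group reaches the
            # query's largest item, so the whole group is skipped.
            continue
        count += sum(1 for row in group["rows"] if q <= row)
    return count


def classify(count, total, rare_max, freq_min):
    """Label a pattern using hypothetical relative-support thresholds."""
    if total == 0 or count == 0:
        return "absent"
    support = count / total
    if support >= freq_min:
        return "frequent"
    if support <= rare_max:
        return "rare"
    return "neither"


def sliding_window_supports(stream, query, window):
    """Return (timestamp, relative support) after each arrival.

    `stream` is an iterable of (timestamp, items) pairs in timestamp order;
    transactions older than `window` time units are dropped before querying.
    """
    current = deque()
    results = []
    for ts, items in stream:
        current.append((ts, items))               # add the new transaction
        while current[0][0] <= ts - window:       # remove expired ones
            current.popleft()
        index = build_index(items for _, items in current)
        results.append((ts, support_count(index, query) / len(current)))
    return results


stream = [(1, [1, 2, 3]), (2, [2, 3]), (3, [1, 3]), (4, [4, 5]), (5, [2, 3, 5])]
supports = sliding_window_supports(stream, query=[2, 3], window=3)
```

In this toy stream the relative support of the itemset {2, 3} falls as older transactions leave the window, which is the kind of declining pattern the processing loop is meant to surface. The same support computation serves both rare and frequent patterns; only the threshold comparison in `classify` differs.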
Keywords/Search Tags:Data, Time, Targeted association, Mining, Implications, Itemset tree, Structure