Relational discovery in sequentially-connected data streams: Efficient algorithms for lossless pattern discovery and change detection

Posted on:2006-11-09

Degree:Ph.D

Type:Dissertation

University:The University of Texas at Arlington

Candidate:Coble, Jeffrey Allen

GTID:1458390008973591

Subject:Computer Science

Abstract/Summary:

We are developing relational data mining techniques that discover structural patterns consisting of complex relationships between entities. Our research is particularly applicable to domains in which the data is event driven, such as counter-terrorism intelligence analysis. Such analytical tasks require discovery of relational patterns between events and actors so that these patterns can be exploited for prediction and action. An additional complexity of these event-driven problems is that they are often continuous, with data streaming in over a long period of time or even indefinitely. Discovery algorithms must therefore repeatedly assimilate new data into the discovery process. However, reprocessing the accumulated data after receiving each new increment is often intractable because of the computational demands of most relational discovery methods.

We have developed an algorithm to mine relational data streams by summarizing discoveries from previous data increments so that the globally best patterns can be computed by examining only the new data increment and isolated sets of sequentially-connected data that span increment boundaries. Our algorithm finds pattern instances that span increment boundaries by using a targeted, localized search, based on the set of globally best substructures, and an efficient graph exploration technique that restricts the range of the graph that must be explored.

Many continuous problems are also dynamic in nature, requiring discovery algorithms to be capable of recognizing and adapting to change over time. We introduce an algorithm that computes a measure of central tendency for a set of graphs, yielding a representative point in graph space. We use these points, along with a distance metric, to measure change in sequential sets of the best patterns discovered from successive data increments. The objective of this work is to enable a method for measuring pattern drift in relational data streams, where the salient patterns may change in prevalence and structure over time. With a measure of central tendency for graph data, along with a method for calculating graph distance, we have a framework with which we can begin to adapt time-series techniques to relational data streams.
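
The idea of a representative point in graph space, combined with a graph distance, can be illustrated with a small sketch. This is not the dissertation's algorithm, whose central-tendency measure and metric are not specified in the abstract. It assumes, purely for illustration, that each pattern is a set of labeled edges, that distance is the size of the symmetric difference of two edge sets, and that the representative point is the set median (the member graph with the smallest total distance to the others). The names graph_distance, set_median, and drift are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a pattern graph is a set of (source label, edge label, target label)
# triples, so nodes are identified by their labels and no isomorphism test is needed.
Edge = Tuple[str, str, str]
Graph = FrozenSet[Edge]


def graph_distance(a: Graph, b: Graph) -> int:
    """Illustrative metric: number of labeled edges present in exactly one graph."""
    return len(a ^ b)


def set_median(graphs: List[Graph]) -> Graph:
    """Return the member graph with the smallest summed distance to all members."""
    if not graphs:
        raise ValueError("need at least one graph")
    return min(graphs, key=lambda g: sum(graph_distance(g, h) for h in graphs))


def drift(previous: List[Graph], current: List[Graph]) -> int:
    """Distance between the representative graphs of two successive pattern sets."""
    return graph_distance(set_median(previous), set_median(current))


# Hypothetical best-pattern sets from two successive data increments.
attends = ("actor", "attends", "meeting")
calls = ("actor", "calls", "contact")
located = ("meeting", "held_at", "location")
funds = ("contact", "funds", "actor")

increment_1 = [
    frozenset({attends, calls}),
    frozenset({attends}),
    frozenset({attends, located}),
]
increment_2 = [
    frozenset({attends, located}),
    frozenset({attends, located, funds}),
    frozenset({located}),
]

print(drift(increment_1, increment_2))
```

A set median is used here because it only requires pairwise distances. A generalized median, which may be a graph outside the set, would be closer to a true mean but is considerably more expensive to compute.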

Keywords/Search Tags:

Relational, Data streams, Discovery, Pattern, Sequentially-connected data, Change, Algorithm

Related items

1	Approach To Dynamic Pattern Discovery And Trace In Data Streams
2	Research Of Pattern Discovery Algorithm Over Data Streams Based On Directed Graph
3	Research On Pattern Discovery And Storage Of XML Data
4	Research On Improvement Of High Utility Pattern Mining Algorithm Over Data Streams
5	Incremental pattern discovery on streams, graphs and tensors
6	High Performance Data Stream Pattern Discovery Algorithms And Their Applications
7	Structural Model Discovery in Temporal Event Data Streams
8	Research On Classification Algorithm Based On CAPE Over Data Streams
9	Research On Algorithm For Relational Data Classification Based On Background Knowledge
10	The Research On The Algorithm Of Mining Frequent Patterns Over Data Streams