
Spatial temporal data mining

Posted on: 2000-07-27
Degree: Ph.D.
Type: Dissertation
University: University of California, Los Angeles
Candidate: Wang, Wei
Full Text: PDF
GTID: 1468390014961745
Subject: Computer Science
Abstract/Summary:
Spatial data mining, or knowledge discovery in spatial databases, is the extraction of implicit knowledge and spatial relations, and the discovery of interesting characteristics and patterns, that are not explicitly represented in the databases. These techniques can play an important role in understanding spatial data and in capturing intrinsic relationships between spatial and nonspatial data. The amount of spatial data obtained from satellite, medical imagery and other sources has been growing tremendously in recent years. A crucial challenge in spatial data mining is the efficiency of the mining techniques, given the often huge amount of spatial data and the complexity of spatial data types and spatial access methods. A STatistical INformation Grid-based approach (STING) [Wan97] was proposed to efficiently support many common "region oriented" queries on a snapshot of a set of objects. Moreover, objects may evolve over time. As a consequence, interesting patterns may emerge or disappear over time. It is preferable to have the system monitor an evolving database to determine when certain patterns specified by a user occur and initiate proper actions upon occurrence. STING+ [Wan99a] extends current spatial data mining techniques to support user-defined triggers, i.e., active spatial data mining.

When the number of attributes is large and/or the values of attributes evolve frequently, it is not desirable or even feasible to ask the user to specify interesting patterns, due to the complexity of patterns and the large number of potential patterns. In particular, strong correlation among different attributes and/or their evolution would be a kind of interesting pattern in many applications. We present a parameterizable model for temporal sequences of numerical attributes and devise efficient ways to search for parameter values that will result in a good fit to (at least a significant portion of) the data.
Metrics for how well instances of the model fit a portion of the data include the familiar measures of support and strength used in association rule mining and a new metric called density. A user specifies thresholds for these metrics and, based on structural properties of the class of models we are attempting to fit to the data, the search space can be drastically pruned using these thresholds.

As both the number of objects and the number of attributes considered are usually very large, it is essential to organize the set of objects by some dynamic indexing structure to facilitate the mining process. A new spatial index structure, the PK-tree [Wan98a], was designed for this purpose. By eliminating the "unnecessary" nodes caused by skewed data distribution, the PK-tree supports efficient data retrieval and update, achieves high storage utilization, provides strong support for concurrency control, and has a solid theoretical foundation, including uniqueness and bounds on node fanout, tree height, and storage requirement.
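
The grid-based idea behind STING mentioned above can be illustrated with a small sketch. It is not the dissertation's implementation: it assumes a 2D unit square, a single numerical attribute, a bottom-level grid whose side is a power of two, and only four per-cell statistics (count, sum, min, max). The names `CellStats`, `build_grid` and `cells_with_mean_at_least` are hypothetical. Each parent cell summarizes its four children, so a region-oriented query can start at the top and skip whole subtrees whose statistics rule them out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CellStats:
    """Summary statistics for one grid cell (an assumed, simplified set)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "CellStats") -> None:
        # Parent statistics are derived from child statistics only.
        self.count += other.count
        self.total += other.total
        self.minimum = min(self.minimum, other.minimum)
        self.maximum = max(self.maximum, other.maximum)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


Grid = List[List[CellStats]]


def build_grid(points: List[Tuple[float, float, float]], size: int) -> List[Grid]:
    """Build the hierarchy from (x, y, value) points with x, y in [0, 1).

    `size` is the number of cells per side at the finest level and must be
    a power of two. Returns levels[0] (finest) up to levels[-1] (1 x 1 root).
    """
    bottom = [[CellStats() for _ in range(size)] for _ in range(size)]
    for x, y, value in points:
        i = min(int(x * size), size - 1)
        j = min(int(y * size), size - 1)
        bottom[i][j].add(value)
    levels = [bottom]
    while size > 1:
        size //= 2
        prev = levels[-1]
        cur = [[CellStats() for _ in range(size)] for _ in range(size)]
        for i in range(size):
            for j in range(size):
                for di in (0, 1):
                    for dj in (0, 1):
                        cur[i][j].merge(prev[2 * i + di][2 * j + dj])
        levels.append(cur)
    return levels


def cells_with_mean_at_least(levels: List[Grid], threshold: float) -> List[Tuple[int, int]]:
    """Top-down search for finest-level cells whose mean is >= threshold.

    A subtree is skipped when its maximum is below the threshold, because
    no descendant cell can then have a mean that reaches the threshold.
    """
    result = []
    stack = [(len(levels) - 1, 0, 0)]
    while stack:
        level, i, j = stack.pop()
        cell = levels[level][i][j]
        if cell.count == 0 or cell.maximum < threshold:
            continue
        if level == 0:
            if cell.mean >= threshold:
                result.append((i, j))
        else:
            for di in (0, 1):
                for dj in (0, 1):
                    stack.append((level - 1, 2 * i + di, 2 * j + dj))
    return sorted(result)
```

With a hierarchy built once over a snapshot, a query such as `cells_with_mean_at_least(levels, 4.0)` only visits the cells whose summaries cannot be ruled out, rather than scanning every object.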
Keywords/Search Tags: Data, Spatial