
Efficient indexing and query processing techniques on spatial time series data

Posted on: 2006-12-27
Degree: Ph.D.
Type: Dissertation
University: University of Minnesota
Candidate: Zhang, Pusheng
Full Text: PDF
GTID: 1458390008462837
Subject: Computer Science
Abstract/Summary:
The explosive growth in the size of data collected from advanced sources such as satellites, sales transactions, medical instruments, and sensors, together with the spatio-temporal nature of that data, poses significant challenges for data analysis. A typical form is spatial time series data: a collection of time series, each referencing a location. Researchers often retrieve interacting relationships among observations in spatial time series data by finding highly correlated time series. For example, such queries have been used to investigate ocean teleconnections, i.e., to identify the land locations on Earth whose climate is often affected by El Niño, the anomalous warming of the eastern tropical Pacific. However, such correlation queries are computationally expensive because of the large number of spatial locations and the length of the time series. Previous methods ignored intrinsic spatio-temporal properties when processing correlation queries, so their efficiency deteriorates substantially on large data.

The major contributions of this work are a novel indexing structure, the spatial cone tree, and efficient query processing algorithms that facilitate correlation queries on spatial time series data by exploiting spatial autocorrelation. Spatial autocorrelation is the property that attribute values at nearby spatial locations tend to be similar. The spatial cone tree groups time series that are in spatial proximity into a disk page and abstracts each group by a cone around a single time series. The spatial cone tree can be used to design hierarchical filter-and-refine query processing strategies that eliminate a large share of the computational cost without affecting query accuracy.

Algebraic analyses using cost models, and experimental evaluations with long-term climate data from the National Aeronautics and Space Administration (NASA), showed that the proposed indexing structure and query processing algorithms save a large portion of the computational cost. The proposed techniques have been used successfully in the investigation of teleconnections in NASA's Earth science data. These techniques have the potential to be extended to many application domains, including those of NASA, the National Geospatial-Intelligence Agency (NGA), the National Cancer Institute (NCI), and the United States Department of Transportation (USDOT).
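
The filter-and-refine idea described in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation. It assumes that each time series is normalized to zero mean and unit length, so that the Pearson correlation of two series equals the cosine of the angle between them, and that a cone is summarized by a center series and a span angle. All function names here (`normalize`, `build_cone`, `correlation_bounds`, `correlated_members`) are hypothetical, and the choice of the first member as the cone center is an arbitrary simplification.

```python
import math


def normalize(series):
    """Shift to zero mean and scale to unit length.

    After this step, the dot product of two series is their Pearson correlation.
    """
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0:
        raise ValueError("a constant series has no defined correlation")
    return [c / norm for c in centered]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def angle(u, v):
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot(u, v))))


def build_cone(members):
    """Summarize a group of unit series as (center, span).

    Assumption: the first member serves as the center; span is the largest
    angle between the center and any member.
    """
    center = members[0]
    span = max(angle(center, m) for m in members)
    return center, span


def correlation_bounds(query, center, span):
    """Bound the correlation between the query and every series in the cone.

    If theta is the angle between the query and the center, each member lies
    at an angle in [max(theta - span, 0), min(theta + span, pi)] from the query.
    """
    theta = angle(query, center)
    upper = math.cos(max(theta - span, 0.0))
    lower = math.cos(min(theta + span, math.pi))
    return lower, upper


def correlated_members(query, members, threshold):
    """Return (indices of members with correlation >= threshold, decision)."""
    center, span = build_cone(members)
    lower, upper = correlation_bounds(query, center, span)
    if upper < threshold:
        return [], "pruned"  # filter: no member can qualify
    if lower >= threshold:
        return list(range(len(members))), "accepted"  # filter: all qualify
    # Refine: the bounds are inconclusive, so check each member individually.
    hits = [i for i, m in enumerate(members) if dot(query, m) >= threshold]
    return hits, "refined"
```

Under these assumptions, a whole group is discarded or accepted with a single angle computation whenever the bounds are decisive, and member-by-member correlation is computed only in the inconclusive case. Because the bounds are exact consequences of the triangle inequality on angles, the filter step does not change the query result, which matches the abstract's claim that accuracy is unaffected.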
Keywords/Search Tags: Spatial, Time series, Data, Query processing, Techniques, Indexing