
Research On Prediction Methods For Urban Dynamic Heterogeneous Spatio-temporal Data

Posted on: 2024-03-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Zhou
GTID: 1522306932957709
Subject: Computer software and theory
Abstract/Summary:
In recent years, with the progress of urbanization and advances in mobile sensing technologies, the volume of urban data has grown steadily. Such data come from many aspects of a city, such as traffic networks, air quality, and electricity grids, and share one property: they have a spatial distribution that varies over time. We therefore call them spatio-temporal urban data. In an era of data explosion and rapidly developing data-driven algorithms, data are valuable. Gaining in-depth insight into the complex characteristics of urban data and mining the essential regularities of spatio-temporal data provide a quantitative basis for decision-making in various urban applications, including pre-deployment for emergent urban events, management of traffic congestion, and governance of the urban environment, which in turn support rapid urban development. Constructing accurate and reliable spatio-temporal prediction algorithms has therefore become an important task for urban development and governance.

With the rise of deep learning and representation learning, a series of studies on spatio-temporal forecasting, including traffic prediction and travel pattern mining, have been carried out. However, because of the inherently dynamic and heterogeneous characteristics of spatio-temporal urban data, existing deep learning models and data mining methods are not fully applicable in new scenarios. This dissertation summarizes four main characteristics of spatio-temporal data that derive from 'heterogeneity' and 'dynamics', and analyzes the corresponding challenges to accurate and reliable prediction.

(1) Spatio-temporal heterogeneity. Observations in different spatial regions reveal different patterns because regional functions differ, while observations within the same region also show different temporal patterns because of periodicity and phase-level heterogeneity. These time-varying characteristics further make spatio-temporal correlations heterogeneous and diverse over time, which challenges spatio-temporal dependence modeling.

(2) Multi-sourced but sparse data. Data within the same spatial and temporal domain can come from multiple sources, but the data from each source tend to be sparse or imbalanced, owing to inherent sparsity or the sparse distribution of sensing devices. This challenges both the accuracy and the reliability of predictions.

(3) Uncertainty. Uncertainty, which usually arises from data fluctuation, can be decomposed into data uncertainty and model uncertainty. These stem respectively from intrinsic data noise and from the model's own insufficient knowledge, and are therefore also called aleatoric and epistemic uncertainty. Both challenge prediction reliability.

(4) Distribution shift. The distribution changes through the accumulation of small fluctuations; that is, the distribution of spatio-temporal data differs under varying environmental context factors. This hinders the generalization of forecasting models.

Together, these four characteristics describe the 'heterogeneous' and 'dynamic' nature of spatio-temporal data from the perspectives of macro observation, model behavior, and distribution variation.

In this dissertation, we model spatio-temporal prediction problems as spatio-temporal graph learning tasks, exploiting the strong non-Euclidean modeling capacity of graph learning. Specifically, in response to the four characteristics and challenges above, this dissertation focuses on four critical issues: adaptive aggregation on spatio-temporal graphs, collaborative prediction over multi-sourced but sparse spatio-temporal data, uncertainty quantification in spatio-temporal learning, and out-of-distribution generalization of spatio-temporal prediction. The main innovations and contributions of this dissertation are as follows.

First, targeting spatio-temporal heterogeneity, this dissertation studies adaptive aggregation on spatio-temporal graphs. We find that topology-task discordance is the key factor behind inaccurate spatio-temporal aggregation, whereas contextual factors and temporal evolution can effectively guide such aggregation. We therefore propose a dynamic graph homophily theory to measure node-wise consistency, and construct an adaptive remediation of the dynamic graph topology based on target homophily. Concretely, we take temporal evolutions as targets and construct directional neighborhood aggregation via target homophily. Furthermore, we extend the neighborhood aggregation to multiple graph neural network layers, which achieves deep information aggregation based on layer importance and customizes the propagation depth for each node. This adaptive graph aggregation framework customizes both the direction and the depth of aggregation and explicitly models spatio-temporal heterogeneity, improving prediction accuracy.

Second, in view of multi-sourced but sparse spatio-temporal data, this dissertation investigates collaborative prediction over multi-sourced sparse spatio-temporal data. We first categorize data sparsity into intrinsic sparsity and fake sparsity. We then propose a data transformation strategy that addresses the zero-inflation issue in intrinsic sparsity, and a collaborative data modeling strategy that deals with fake sparsity, to improve the filling rate of spatio-temporal data. On this basis, we propose a spatio-temporal multi-granularity urban event prediction framework to address sparse event prediction. We verify our sparse spatio-temporal prediction model on two traffic accident datasets, where the results demonstrate improved accuracy and reliability of collaborative prediction on multi-sourced sparse spatio-temporal data.

Third, faced with the uncertainty of both spatio-temporal data and learning models, this dissertation tackles uncertainty quantification in spatio-temporal learning. We first decompose uncertainty into epistemic and aleatoric components: epistemic uncertainty measures the sufficiency of the knowledge learned from samples, while aleatoric uncertainty is a data-related property representing fluctuations from both intrinsic noise and unobservable factors. We therefore propose to quantify epistemic uncertainty with a sample density prober, and attribute the aleatoric variation induced by unobservable factors to the variance observed under the same context environments. We then propose an aleatoric uncertainty learner guided by context and spatio-temporal variance. Because uncertainty is critical in data-constrained scenarios, we formalize two non-consecutive forecasting scenarios to realize an uncertainty-aware spatio-temporal framework; the rationale of the uncertainty quantification and its benefit for prediction stability are empirically verified on three traffic datasets.

Fourth, in view of the distribution shift in spatio-temporal data, this dissertation studies out-of-distribution (OOD) generalization for spatio-temporal prediction models. First, based on the principle of causal invariance, we transform OOD generalization into the problem of discovering invariant associations in spatio-temporal data. We then construct a phase-aware sub-environment partition to tackle phase heterogeneity, which also makes it possible to capture hierarchical invariance, both global and local. To capture as many diverse but relatively invariant spatio-temporal associations as possible, we design a spatio-temporal consistency learner that extracts both spatially bi-directional causal correlations and temporal trend consistency. Finally, we design a hierarchical invariance mining module that captures invariances within and across sub-environments, yielding both local and global invariant spatio-temporal associations. We perform extensive experiments on three out-of-distribution generalization tasks: temporal covariate distribution shift, inductive inference on new nodes, and noise injection. The experimental results show that our solution achieves consistently superior results and improves the generalization capacity of spatio-temporal models.

In this dissertation, we model spatio-temporal data as dynamic graphs, treat the task-associated environmental elements as contextual factors, and exploit theories of graph topology learning, multi-source data fusion, uncertainty quantification, and causal invariance discovery to explore the intrinsic regularities of spatio-temporal data under various scenarios. We propose a range of new solutions and techniques, which together improve prediction accuracy, reliability, and generalization capacity. We take node-level graph regression tasks, such as urban region flow prediction, traffic event prediction, and air quality prediction, as representative tasks to verify the correctness and superiority of our solutions, thereby improving and developing key theories of spatio-temporal data mining. Finally, we integrate several spatio-temporal prediction algorithms proposed in this research, including but not limited to traffic prediction, road network risk forecasting, and non-consecutive traffic prediction, into a multi-granularity urban traffic management system, which further facilitates the practical deployment of these research results.
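
The split between aleatoric and epistemic uncertainty described in the third contribution can be illustrated with a small sketch. It does not reproduce the dissertation's sample density prober or its variance-guided learner. It swaps in a common alternative, an ensemble of predictors combined through the law of total variance, and the function name, array shapes, and numbers are all hypothetical.

```python
import numpy as np


def decompose_uncertainty(means, variances):
    """Split predictive variance for an equally weighted ensemble.

    means, variances: arrays of shape (n_members, n_nodes) holding each
    ensemble member's predicted mean and predicted noise variance per node.
    This is an illustrative stand-in, not the dissertation's method.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # Aleatoric part: the noise variance the members predict, averaged.
    aleatoric = variances.mean(axis=0)
    # Epistemic part: disagreement between the members' predicted means.
    epistemic = means.var(axis=0)
    # Law of total variance: the two parts sum to the mixture variance.
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical traffic-flow predictions for two nodes from three members.
means = np.array([[10.0, 50.0], [12.0, 50.0], [14.0, 50.0]])
variances = np.array([[1.0, 4.0], [1.0, 4.0], [1.0, 4.0]])
aleatoric, epistemic, total = decompose_uncertainty(means, variances)
```

In this toy input the members disagree only on the first node, so only that node has nonzero epistemic uncertainty, while the second node carries the larger aleatoric term because every member predicts a higher noise variance for it.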
Keywords/Search Tags:Spatio-temporal data, Urban computing, Spatio-temporal heterogeneity, Graph Neural Network