
Clustering And Outlier Detection Upon Trajectory Streams

Posted on: 2019-03-06; Degree: Doctor; Type: Dissertation
Country: China; Candidate: J L Mao; Full Text: PDF
GTID: 1368330563955381; Subject: Software engineering
Abstract/Summary:
With the widespread adoption of modern mobile devices and the rapid development of location acquisition technologies, numerous moving objects continuously report their locations, including longitude and latitude coordinates, speed, direction, and timestamp. As a result, a tremendous amount of positional information accumulates in the form of a centralized trajectory stream or of distributed trajectory streams. This makes it necessary to analyze trajectory streams in a timely and effective manner, both to gain insight into the evolving movement behavior of the objects and to reveal the events that may lie behind the movement patterns.

Clustering and outlier detection are two typical techniques for discovering movement patterns in trajectory data. As an unsupervised approach, clustering groups a large number of trajectories into comparatively homogeneous clusters in order to extract the representative paths or the common movement trends shared by various objects. Conversely, the main task of outlier detection is to identify the few trajectories that differ significantly from the rest of the data set, and from them to discover abnormal events. Efficient mechanisms for clustering and outlier detection upon trajectory streams can support a broad range of time-critical applications, such as intelligent transportation management, route planning, and road infrastructure optimization.

Because a trajectory stream is a continuous, unbounded sequence of positions ordered by explicit timestamps, little work has been devoted to clustering and outlier detection upon streaming trajectories. The main challenge comes from the strict space and time complexity constraints of processing continuously arriving trajectory data, combined with the difficulty of concept drift. In addition, constraints such as the skewed distribution and evolving nature of trajectory data make it hard to design highly effective methods for clustering and outlier detection upon trajectory streams. It is even more difficult to obtain a high-precision solution over distributed streams that meets the requirement of on-the-fly execution with minimal communication overhead among the nodes. To address these issues, this dissertation develops clustering and outlier detection techniques upon centralized and distributed trajectory streams. The major contributions of this dissertation are as follows.

1. Clustering Analysis upon Trajectory Streams. On the basis of the sliding window model, we proposed a framework, called OCluWin, to cluster evolving streaming trajectories. It contains a micro-clustering component that clusters and summarizes the most recent sets of trajectory line segments at each time instant, and a macro-clustering component that builds large macro-clusters from the micro-clusters over a specified time horizon. We then presented two methods, named TSCluWin and OCluST respectively, based on two defined synopsis structures (EFo and EF), which extract the spatio-temporal clustering characteristics of the stream data in memory and track the latest cluster changes of the trajectory streams in real time. Theoretical analysis and experimental results on real trajectory data sets showed that our proposal achieves superior performance in clustering streaming trajectories.

2. Online Clustering upon Distributed Trajectory Streams. To cluster efficiently the trajectory streams produced by multiple dispersed nodes, we first presented a distributed synopsis structure that extracts the clustering characteristics of trajectory data. Then, combining it with the sliding window model, we designed a two-layer distributed framework and developed an incremental algorithm, called OCluDTS, for online clustering over distributed trajectory streams. It contains a parallel local clustering component that clusters and summarizes the most recent sets of trajectories on each remote site, and a global clustering component on the coordinator that builds the global clusters from the received local synopsis structures. Moreover, a pruning mechanism for similarity calculation and an optimization strategy of testing first and transferring later improve the efficiency of OCluDTS. Theoretical analysis and comprehensive experimental results on a real-world data set demonstrated that OCluDTS achieves high quality and high scalability on distributed trajectory streams.

3. Feature Grouping-based Outlier Detection upon Streaming Trajectories. In practical applications, a trajectory outlier often manifests as a significant difference in motion behavior between a trajectory and its local neighbors. To detect such outliers upon trajectory streams, we first proposed a feature grouping-based mechanism that divides all the features of trajectory data into two groups: Similarity Feature and Difference Feature. For the trajectory fragments obtained by trajectory simplification, we searched for the close neighbors of each fragment according to Similarity Feature, and then identified the outliers within that neighborhood in terms of Difference Feature. Based on the feature differences among locally adjacent objects in one or more time intervals, we presented two outlier definitions: the local anomaly trajectory fragment (TF-outlier) and the evolutionary anomaly moving object (MO-outlier). Furthermore, we devised a basic solution and then an optimized algorithm to detect both types of outliers. Experimental results on three real data sets validated the effectiveness and efficiency of our proposal.

4. Outlier Detection over Distributed Trajectory Streams. To capture the behavioral outlierness of each trajectory relative to its local neighborhood upon distributed streams, we presented trajectory outlier definitions that characterize the anomaly trajectory fragment, the anomaly fragment cluster, and the evolutionary anomaly object in distributed trajectory streams. On that basis, we proposed the first scalable decentralized outlier detection algorithm over distributed trajectory streams, called ODDTS. To provide feature grouping-based outlier detection continuously over distributed trajectory streams, it consists of remote site processing (F-outlier and FC-outlier detection) and coordinator processing (EO-outlier). Our proposal achieves clear performance gains through a parallel outlier detection mechanism with minimal transmission overhead among the nodes. Extensive experiments over real trajectory data demonstrated the high detection validity, low communication cost, and linear scalability of ODDTS for identifying outliers online upon distributed trajectory streams.

All in all, this dissertation focused on clustering analysis and anomaly detection upon trajectory streams, and conducted a detailed analysis around four basic issues. First, since the sliding window model is one of the basic models for processing trajectory streams and has the advantage of eliminating the effects of obsolete data, how to cluster continuously arriving trajectory stream data incrementally over sliding windows becomes a basic issue. Second, trajectory streams are collected by distributed nodes in more and more applications. How to extend the existing clustering techniques for centralized trajectory streams and design an appropriate clustering method for distributed trajectory streams also becomes a basic issue, one that must deliver high-quality clustering results while guaranteeing high efficiency. Third, in practical applications a trajectory outlier usually shows a significant behavioral difference from its local spatio-temporal neighbors. In view of this, how to measure the outlierness of each trajectory's movement behavior based on its local spatio-temporal neighborhood, and further identify the abnormal trajectories and abnormal objects, becomes a basic issue. Fourth, how to design an effective outlier detection technique upon distributed trajectory streams becomes a basic problem to be solved. A solution must be capable of identifying, on each node, the abnormal trajectory or trajectory cluster that shows an obvious behavioral difference within its local spatio-temporal neighborhood, and then of detecting the abnormal evolving moving objects in the distributed environment.

To tackle these basic issues, this dissertation developed basic approaches and optimized ones. Our work was based on a detailed analysis of existing theories, techniques, and methods. Theoretical analysis and extensive experimental results on real data showed that our proposals solve the aforementioned issues efficiently and have significant advantages in result quality and execution efficiency.
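The feature grouping idea described under contribution 3 can be illustrated with a small sketch. It is not the dissertation's algorithm: the abstract does not say which features fall into each group or how outlierness is scored, so the choices here are assumptions made only for illustration. The sketch assumes the fragment midpoint serves as the Similarity Feature, speed serves as the Difference Feature, and a fragment is flagged when a given share of its neighbors differ from it in speed by more than a threshold. The names `Fragment`, `detect_outliers`, `radius`, `speed_gap`, `min_neighbors`, and `ratio` are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Fragment:
    """One trajectory fragment (hypothetical layout, not from the source)."""
    obj_id: str
    x: float      # assumed Similarity Feature: fragment midpoint, x coordinate
    y: float      # assumed Similarity Feature: fragment midpoint, y coordinate
    speed: float  # assumed Difference Feature


def neighbors(frag, fragments, radius):
    """Return the fragments whose midpoints lie within `radius` of `frag`."""
    return [
        g for g in fragments
        if g is not frag and hypot(g.x - frag.x, g.y - frag.y) <= radius
    ]


def detect_outliers(fragments, radius, speed_gap, min_neighbors, ratio):
    """Flag fragments whose speed differs from most of their close neighbors.

    Step 1 groups by the Similarity Feature (spatial closeness).
    Step 2 compares the Difference Feature (speed) inside that neighborhood.
    """
    outliers = []
    for f in fragments:
        nbrs = neighbors(f, fragments, radius)
        if len(nbrs) < min_neighbors:
            continue  # too few neighbors to make a judgment
        differing = sum(1 for g in nbrs if abs(g.speed - f.speed) > speed_gap)
        if differing / len(nbrs) >= ratio:
            outliers.append(f)
    return outliers


# Toy data: five nearby fragments, one of them much faster, plus one isolated.
fragments = [
    Fragment("a", 0.0, 0.0, 10.0),
    Fragment("b", 0.2, 0.1, 11.0),
    Fragment("c", 0.1, 0.3, 9.0),
    Fragment("d", 0.3, 0.2, 10.0),
    Fragment("e", 0.2, 0.2, 30.0),
    Fragment("f", 10.0, 10.0, 50.0),
]
found = detect_outliers(fragments, radius=1.0, speed_gap=5.0,
                        min_neighbors=2, ratio=0.8)
print([f.obj_id for f in found])  # prints ['e']
```

In this toy run, fragment `e` is flagged because all four of its neighbors move much more slowly, while the isolated fragment `f` is skipped rather than flagged because it has no neighborhood to compare against. A streaming version would additionally restrict the comparison to fragments inside the current time window.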
Keywords/Search Tags: Trajectory Stream, Clustering, Concept Drift, Outlier Detection, Distributed, Scalable