
Research And Implementation Of Calculating The Similarity Between Massive Spatio-temporal Tracks

Posted on: 2020-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: G K Tu
Full Text: PDF
GTID: 2428330572967249
Subject: Communication and Information System
Abstract/Summary:
A spatiotemporal trajectory is a sequence that records the locations and times of a moving object. As an important spatio-temporal data type and information source, trajectories arise in many areas, such as human behavior, traffic and logistics, emergency evacuation management, animal habits, and marketing. Through cluster analysis of trajectory data, similarity and anomaly features can be extracted and meaningful patterns can be found.

The traditional LCSS (longest common subsequence) algorithm is sensitive to the choice of time threshold when comparing trajectory points. To address this, a new LCSS+ algorithm is proposed, which has stable performance and a high recognition rate under different time thresholds. Because spatio-temporal trajectory data is sparse, a large number of comparisons are made between invalid trajectory points. In this thesis, a grid-based partitioning algorithm is applied to LCSS+, which greatly reduces the number of comparisons between trajectory points and improves the efficiency of the algorithm. To handle large volumes of trajectory data, a distributed LCSS+ algorithm is proposed. Experimental results show that the distributed LCSS+ algorithm shortens the comparison time and improves real-time performance on large data sets.

At the same time, because the data is unevenly distributed, trajectory data cannot be evenly allocated from the map side to the reduce side when MapReduce is used to calculate trajectory similarity, which leaves some reduce nodes heavily loaded. Tasks on the heavily loaded nodes take a long time, and because the MapReduce job waits until all reduce tasks are completed, the overall running time increases. To address these problems, an optimization method is proposed. First, the original trajectory data is sampled and the frequency of each key value is counted, so that the distribution of the whole trajectory data set can be estimated. Second, to overcome the shortcomings of the default partitioning algorithm, an improved partitioning algorithm is proposed for the intermediate results coming from the map side. The improved partitioning algorithm is compared with the default hash partitioning algorithm in experiments on job running time and reduce-side load. The results show that the improved partitioning algorithm performs better than the default hash partitioning algorithm on highly skewed trajectory data.
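
To make the role of the time threshold concrete, the following is a minimal sketch of the classical LCSS similarity between two trajectories. It is not the LCSS+ algorithm of the thesis, whose details the abstract does not give. The point layout `(x, y, t)`, the function name `lcss_similarity`, and the parameters `eps` (spatial threshold) and `tau` (time threshold) are assumptions made for illustration.

```python
from math import hypot


def lcss_similarity(a, b, eps, tau):
    """Classical LCSS similarity between two trajectories.

    a, b: lists of (x, y, t) points (layout assumed for this sketch).
    Two points match when they are within eps in space and within tau
    in time. Returns the LCSS length divided by the length of the
    shorter trajectory, a value in [0, 1].
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        return 0.0
    # dp[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        xa, ya, ta = a[i - 1]
        for j in range(1, m + 1):
            xb, yb, tb = b[j - 1]
            close_in_space = hypot(xa - xb, ya - yb) <= eps
            close_in_time = abs(ta - tb) <= tau
            if close_in_space and close_in_time:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m] / min(n, m)
```

In this sketch, two trajectories that follow the same path at different times score zero when `tau` is small and score high when `tau` is large, which illustrates the threshold sensitivity that the abstract attributes to traditional LCSS.
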
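
The two-step idea of sampling key frequencies and then partitioning by estimated load can be sketched as follows. The abstract does not specify the improved partitioning algorithm, so this example substitutes a simple greedy heuristic (heaviest key first, assigned to the least-loaded reducer) and is written in plain Python rather than as a Hadoop partitioner. All function names here (`sample_keys`, `build_partition_table`, `partition`, `reducer_loads`) are hypothetical.

```python
import heapq
import random
import zlib
from collections import Counter


def sample_keys(keys, fraction, seed=0):
    """Draw a random sample of the keys (sampling rate is an assumption)."""
    if not keys:
        return []
    rng = random.Random(seed)
    k = max(1, int(len(keys) * fraction))
    return rng.sample(keys, k)


def build_partition_table(sampled_keys, num_reducers):
    """Map each sampled key to a reducer using a greedy heuristic:
    process keys from most to least frequent and give each one to the
    reducer with the smallest estimated load so far."""
    freq = Counter(sampled_keys)
    heap = [(0, r) for r in range(num_reducers)]  # (estimated load, reducer id)
    heapq.heapify(heap)
    table = {}
    for key, count in freq.most_common():
        load, reducer = heapq.heappop(heap)
        table[key] = reducer
        heapq.heappush(heap, (load + count, reducer))
    return table


def partition(key, table, num_reducers):
    """Use the table when the key was sampled; otherwise fall back to a
    stable hash, similar in spirit to default hash partitioning."""
    if key in table:
        return table[key]
    return zlib.crc32(str(key).encode("utf-8")) % num_reducers


def reducer_loads(keys, table, num_reducers):
    """Count how many records each reducer would receive."""
    loads = [0] * num_reducers
    for key in keys:
        loads[partition(key, table, num_reducers)] += 1
    return loads
```

Comparing `reducer_loads` for a table built this way against an empty table (pure hash fallback) gives a rough picture of how much a frequency-aware assignment can even out reduce-side load on skewed keys.
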
Keywords/Search Tags:Spatiotemporal trajectory, LCSS, Ajoint model, Hadoop