
Research And Design On The Similarity Search System For Massive Trajectories

Posted on: 2022-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: W Z Wang
GTID: 2518306509494214
Subject: Computer technology
Abstract/Summary:
In recent years, with the rapid development of global positioning technology and wireless communication networks, trajectory data has become easier to collect and use, providing important value for applications such as urban traffic planning, travel pattern mining, and point-of-interest recommendation. However, trajectory data is characterized by large scale, varying sampling frequencies, and poor data quality, all of which directly affect both the mining results and the computational efficiency. For this reason, the analysis and processing of massive trajectory data has long been a focus of attention in academia and industry. Within this work, trajectory similarity search is one of the key operations and forms the basis for applications such as mobile behavior pattern mining and abnormal trajectory detection. When the trajectory data set is large, however, similarity search becomes extremely inefficient, and how to perform it efficiently has been a research hotspot in recent years. This thesis studies similarity search methods for massive trajectory data, and designs and implements a corresponding parallel system prototype. The specific work is as follows.

This thesis proposes a trajectory similarity search algorithm based on a multi-level index structure. The structure consists of a grid index and a combined Start-End-index and Feature-point-index, so that during a similarity search the trajectory data passes through two filtering stages, one coarse-grained and one fine-grained, which effectively improves query efficiency. Index construction proceeds in two steps. The first step partitions the trajectory data set into a spatial grid and numbers the resulting grid subsets to generate the grid index, which provides coarse-grained filtering of the trajectory data. The second step works within each spatial grid subset: the start and end points of the trajectories are clustered and partitioned to establish the Start-End-index; the feature trajectory generator proposed in this thesis then computes the feature trajectory points of each trajectory in a partition to establish the Feature-point-index. Together, these two indexes provide fine-grained filtering of the trajectory data.

Based on this multi-level index structure, the thesis designs and implements a trajectory similarity search system. The system is built as an improvement on DITA, the most representative open-source system in the field, and provides both threshold-based and Top-K trajectory similarity search. Because the system runs on top of the Spark big data processing platform, it can be deployed on a single machine or on a distributed cluster.

A large number of experiments were carried out in a single-machine environment and in a distributed environment composed of three servers. The results show that the proposed multi-level index structure performs well at filtering trajectory data sets. Compared with the DITA system, in both the single-machine and the distributed environment, our system improves similarity search performance on million-scale trajectory data by about 20%.
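The filter-then-verify idea behind a multi-level index can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not name a distance function, so discrete Fréchet distance is assumed here, and the class `GridIndex`, its methods, and the `cell_size` parameter are hypothetical names chosen for illustration. The sketch indexes each trajectory by the grid cell of its start point (coarse filter), then discards candidates whose start or end point lies farther than the threshold from the query's (fine filter), which is safe for discrete Fréchet distance because the first points and the last points are always matched to each other. The clustering step and the feature trajectory generator described above are not reproduced, and the Top-K function is a plain brute-force baseline.

```python
import heapq
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def discrete_frechet(a: Trajectory, b: Trajectory) -> float:
    """Discrete Frechet distance computed with the standard dynamic program."""
    n, m = len(a), len(b)
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(a[i], b[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[-1][-1]


class GridIndex:
    """Hypothetical two-stage index: grid cells, then start/end point checks."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], List[str]] = defaultdict(list)
        self.trajectories: Dict[str, Trajectory] = {}

    def _cell(self, p: Point) -> Tuple[int, int]:
        return (math.floor(p[0] / self.cell_size),
                math.floor(p[1] / self.cell_size))

    def add(self, tid: str, traj: Trajectory) -> None:
        # Assumption: each trajectory is registered under its start-point cell.
        self.trajectories[tid] = traj
        self.cells[self._cell(traj[0])].append(tid)

    def threshold_search(self, query: Trajectory, tau: float):
        """Return (id, distance) pairs with distance <= tau, nearest first."""
        # Coarse filter: start points within tau of each other can differ
        # by at most floor(tau / cell_size) + 1 cells along each axis.
        reach = int(tau // self.cell_size) + 1
        cx, cy = self._cell(query[0])
        results = []
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                for tid in self.cells.get((cx + dx, cy + dy), ()):
                    traj = self.trajectories[tid]
                    # Fine filter: start and end distances are lower bounds
                    # on the discrete Frechet distance.
                    if math.dist(query[0], traj[0]) > tau:
                        continue
                    if math.dist(query[-1], traj[-1]) > tau:
                        continue
                    # Verification: compute the exact distance.
                    d = discrete_frechet(query, traj)
                    if d <= tau:
                        results.append((tid, d))
        return sorted(results, key=lambda pair: pair[1])

    def top_k(self, query: Trajectory, k: int):
        """Brute-force Top-K baseline (no index pruning), for comparison."""
        scored = ((discrete_frechet(query, t), tid)
                  for tid, t in self.trajectories.items())
        return [(tid, d) for d, tid in heapq.nsmallest(k, scored)]
```

In this sketch the exact distance is computed only for trajectories that survive both filters, so the cost of a threshold query depends on how many candidates the cheap checks discard rather than on the size of the whole data set.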
Keywords/Search Tags:Trajectory Data, Trajectory Similarity Search, Feature Trajectory, Big Data Processing