
Research On Storage And Query Of Spatio-temporal Trajectory Data Based On Phoenix Platform

Posted on: 2021-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: J W Han
GTID: 2518306050464654
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information and communications technology and the spread of portable mobile devices, a large amount of user data has been generated. Spatio-temporal trajectory data is one kind of such data with practical application value, and organizing and using it effectively is a concern shared by academia and industry. For massive data storage and processing, traditional solutions based on relational databases scale poorly and are not well suited to large datasets. Distributed computing platforms such as Hadoop offer a new approach to this problem. The Hadoop ecosystem has grown to include many components; among them, the non-relational database HBase and Phoenix work well together, can process massive data, and provide SQL support. However, they do not directly support the organization and management of spatio-temporal trajectory data.

To address these problems, and taking the features of the distributed platform into account, this thesis studies the storage and query of spatio-temporal trajectory data and designs and implements Traj Phoenix, a Phoenix-based prototype system for trajectory data storage and query. The prototype supports both real-time import and offline batch import of data. It implements optimized spatial and temporal range queries, time-period KNN queries, and nearest-neighbor trajectory queries. It is user-friendly and supports SQL statements. The main work of this thesis is as follows:

(1) The thesis surveys existing spatio-temporal trajectory data management schemes based on distributed platforms. Based on the characteristics of the data and of Phoenix, it designs a data storage model and selects an appropriate indexing method, which together provide support for queries.

(2) The prototype system uses ST-Code as its spatio-temporal index structure. The thesis analyzes the encoding characteristics of ST-Code and the problems that arise when querying with it, and proposes two methods to optimize the query process: a divide-merge strategy and a strategy based on data distribution statistics.

(3) Based on these two optimization strategies, the thesis implements spatio-temporal range queries, time-period KNN queries, and nearest-neighbor trajectory queries, and proposes several query optimization algorithms based on data statistics and Phoenix UDFs.

(4) For high availability and ease of use, the prototype system also provides supporting functions such as data import tools and serialization tools.

Finally, experiments are conducted on a real spatio-temporal trajectory dataset, with the prototype system deployed in an experimental environment. The experiments measure data import and query performance and compare the results with similar solutions. The results show that the proposed optimized ST-Code-based query methods are, to some degree, more efficient than similar solutions.
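The abstract does not define ST-Code, so the following is only a minimal sketch of the general idea behind a spatio-temporal code: quantizing longitude, latitude, and time into grid cells and interleaving their bits into one sortable integer (a Z-order-style key). The function names, the bit order (time, longitude, latitude), and the 8-bit resolution are assumptions made for illustration and are not taken from the thesis.

```python
def quantize(value, lo, hi, bits):
    """Map a value in [lo, hi] to an integer cell index in [0, 2**bits - 1]."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    cells = 1 << bits
    idx = int((value - lo) / (hi - lo) * cells)
    return min(idx, cells - 1)  # the upper bound falls into the last cell


def interleave3(x, y, t, bits):
    """Interleave the bits of t, x, y (most significant first) into one integer.

    Hypothetical bit order chosen for this sketch: t, then x, then y.
    """
    code = 0
    for i in range(bits - 1, -1, -1):
        code = (code << 1) | ((t >> i) & 1)
        code = (code << 1) | ((x >> i) & 1)
        code = (code << 1) | ((y >> i) & 1)
    return code


def deinterleave3(code, bits):
    """Inverse of interleave3: recover (x, y, t) cell indices from a code."""
    x = y = t = 0
    for i in range(bits - 1, -1, -1):
        shift = 3 * i
        t = (t << 1) | ((code >> (shift + 2)) & 1)
        x = (x << 1) | ((code >> (shift + 1)) & 1)
        y = (y << 1) | ((code >> shift) & 1)
    return x, y, t


def encode_point(lon, lat, ts, t_min, t_max, bits=8):
    """Encode one trajectory point as a single integer key (illustrative only)."""
    x = quantize(lon, -180.0, 180.0, bits)
    y = quantize(lat, -90.0, 90.0, bits)
    t = quantize(ts, t_min, t_max, bits)
    return interleave3(x, y, t, bits)


if __name__ == "__main__":
    # Two nearby points sampled at almost the same time, with time in [0, 100].
    a = encode_point(116.40, 39.90, 50.0, 0.0, 100.0)
    b = encode_point(116.41, 39.91, 50.1, 0.0, 100.0)
    print(a, b, deinterleave3(a, 8))
```

In a key-ordered store such as HBase, points that fall into the same cell receive the same code and points in neighboring cells often share a long key prefix, which is why a range query over such a code can be served by a small number of key-range scans. How the thesis's divide-merge and statistics-based strategies choose those scans is not described in the abstract and is not modeled here.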
Keywords/Search Tags:spatio-temporal trajectory data, spatio-temporal index, Phoenix, ST-Code