Research And Implementation Of Trajectory Big Data Query System Based On HBase

Posted on: 2022-11-22  Degree: Master  Type: Thesis
Country: China  Candidate: Z Y Li
GTID: 2518306788456834  Subject: Automation Technology
Abstract/Summary:
With the wide application of positioning technology, large numbers of moving objects have generated massive amounts of trajectory data. These data contain rich knowledge and patterns that need to be exploited efficiently. Trajectory big data has the general characteristics of big data as well as characteristics unique to trajectories, such as spatiotemporal sequentiality and uneven distribution in the temporal and spatial dimensions. Building a trajectory big data query system raises several difficulties, including cleaning noise points within a big data framework and indexing and querying the data. To query trajectory data accurately and efficiently, this paper designs and implements an HBase-based trajectory big data query system that relies on the Hadoop platform's ability to process and store big data, and verifies it experimentally on the massive real trajectory data accumulated in the laboratory. Because a query system is the basis and support for trajectory big data applications, this paper takes its construction as the starting point and focuses on three important aspects of the construction process: trajectory filtering, trajectory indexing, and trajectory query. The specific work is as follows:

(1) To address data quality problems in the original trajectory data, a trajectory data cleaning process is proposed. The process handles two types of abnormality separately. The first type is handled by filtering on the spatiotemporal boundary of the imported data: trajectory data outside the spatiotemporal collection interval are screened out. The second type is the drift point problem within a trajectory. Speed and acceleration thresholds are set, the longest sub-trajectory that satisfies the thresholds is used as the reference trajectory, and the trajectory is completed by expanding from it toward both ends with a secondary screening.

(2) For the object-time range query and the time-space range query of trajectory big data, corresponding index structures and storage models are designed, and the query method under the distributed architecture is optimized. For the time-space range query in particular, an index method based on pre-partitioning from historical data is proposed. An auxiliary secondary index structure optimizes the storage of trajectory big data and thereby improves query efficiency. Based on this index structure, two query methods are proposed: spatial redundancy and spatial partitioning. Experiments show that the proposed index and query methods effectively improve spatiotemporal query performance on unevenly distributed trajectory big data while ensuring the accuracy of query results and minimizing the number of generated subqueries.

(3) The trajectory big data query system is designed and implemented. The system is developed on Spring Boot and uses Hadoop ecosystem components such as MapReduce, HBase, and Flume to manage the full life cycle of trajectory data, from cleaning and indexing to visualization of query results, providing a foundation for subsequent analysis and mining. For the imported massive trajectory data, both object-time range queries and time-space range queries achieve good response times.
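The drift-point cleaning step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The point layout `(t, x, y)` with seconds and planar metres, the function names, and the threshold values are all assumptions made for illustration, and the secondary screening is simplified here to a speed check against the nearest point already kept.

```python
from math import hypot

def segment_speed(a, b):
    """Speed from point a to point b; points are (t, x, y) tuples (assumed layout)."""
    dt = b[0] - a[0]
    if dt <= 0:
        return float("inf")  # treat non-increasing timestamps as invalid
    return hypot(b[1] - a[1], b[2] - a[2]) / dt

def longest_valid_run(points, max_speed, max_accel):
    """Inclusive (start, end) indices of the longest run of consecutive points
    whose segment speeds and speed changes stay within the thresholds."""
    best = (0, 0)
    start = 0
    prev_speed = None
    for i in range(1, len(points)):
        v = segment_speed(points[i - 1], points[i])
        if v > max_speed:
            # Speed violation: the run cannot include this segment.
            start, prev_speed = i, None
            continue
        dt = points[i][0] - points[i - 1][0]
        if prev_speed is not None and abs(v - prev_speed) / dt > max_accel:
            # Acceleration violation: restart the run at this segment.
            start = i - 1
        prev_speed = v
        if i - start > best[1] - best[0]:
            best = (start, i)
    return best

def clean_trajectory(points, max_speed, max_accel):
    """Keep the longest valid sub-trajectory as the reference, then expand
    toward both ends, re-admitting points that pass a speed check
    (an assumed, simplified form of the secondary screening)."""
    if len(points) < 2:
        return list(points)
    start, end = longest_valid_run(points, max_speed, max_accel)
    kept = list(points[start:end + 1])
    for p in points[end + 1:]:            # expand forward in time
        if segment_speed(kept[-1], p) <= max_speed:
            kept.append(p)
    for p in reversed(points[:start]):    # expand backward in time
        if segment_speed(p, kept[0]) <= max_speed:
            kept.insert(0, p)
    return kept

# Hypothetical track: 10 m/s along x, with drift points at t=3 and t=8.
track = [(0, 0, 0), (1, 10, 0), (2, 20, 0), (3, 5000, 5000), (4, 40, 0),
         (5, 50, 0), (6, 60, 0), (7, 70, 0), (8, -9000, 0), (9, 90, 0)]
cleaned = clean_trajectory(track, max_speed=30.0, max_accel=10.0)
print([p[0] for p in cleaned])  # timestamps 3 and 8 are dropped
```

Anchoring on the longest valid run means a single drift point cannot become the reference. Each candidate is compared against a point already accepted, so a jump away and back removes only the outlier and not the valid points around it.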
Keywords/Search Tags: Trajectory data, trajectory data index, spatiotemporal range query, Geohash encoding pre-partitioning