City Hot-Spots Query And Optimization

Posted on: 2020-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: J X Kang
Full Text: PDF
GTID: 2392330596986219
Subject: Computer technology
Abstract/Summary:
A city Hot-Spot is a spatio-temporal area with frequent trips and heavy traffic flow. Hot-Spots detection has many applications in public services, such as urban infrastructure construction, transportation planning, shop location selection, and crime prevention. Current Hot-Spots detection methods usually adopt the Getis-Ord G_i^* statistic: they divide the trajectories into spatio-temporal areas and calculate the hot value of the areas covered by all the trajectory data to obtain the city Hot-Spots.

As practical applications expand, people place higher requirements on city Hot-Spots detection and hope to customize it according to different needs. Detection for different requirements uses different data, but the existing methods perform a one-time calculation over massive historical data, so the Hot-Spots obtained often cannot meet the actual needs. When the trajectory data is updated, or when detection is performed for a different requirement, the Hot-Spots have to be recalculated by the detection algorithm, which occupies a large amount of memory and takes a long time. Because of the large number of accumulated trajectories and the complicated calculations, optimization of existing detection algorithms focuses on how to handle massive data, for example by using distributed computing. In the currently known literature, there is no city Hot-Spots detection designed specifically for different requirements.

In view of the above problems, this paper studies the parameterized "city Hot-Spots query". It sets five types of query parameters that meet actual needs (geographic range, date range, Hot-Spots granularity, time organization, and the number of Hot-Spots) and implements the multi-parameter city Hot-Spots query through appropriate data organization. For different query parameters, the city Hot-Spots query needs to process different data. We propose a small-grained 3D index grid to reorganize the trajectory data, so that the data to be processed can be extracted quickly. For large-scale data queries, the common query algorithm takes a long time. We use a sampling & filtering strategy to reduce the time complexity of the algorithm to O(1) and improve query performance on large-scale data. For the detection algorithm, an optimized Getis-Ord G_i^* statistical method is adopted. The optimization includes using two jobs to solve the data skew problem and regrouping RDD elements to solve the excessive shuffle volume. The query experiments are carried out on a Spark cluster with the New York City taxi trajectory dataset. The results show that our small-grained 3D index grid method and storage strategy can implement queries with the specified parameters and greatly reduce the query response time. The sampling & filtering strategy effectively solves the problem of long query response times caused by excessive data volume. In addition, the optimization method in the detection algorithm significantly improves the detection efficiency of the distributed algorithm.
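As an illustration of the detection step described in the abstract, the following is a minimal single-machine sketch of the standard Getis-Ord G_i^* statistic computed on a 3D (x, y, time) grid. It is not the thesis's Spark implementation and does not include the index grid storage, the sampling & filtering strategy, or the two-job optimization. The function names (bin_points, getis_ord_scores, top_k), the binary weights over the 3x3x3 neighborhood (including the cell itself), and the treatment of cells outside the grid as absent are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product


def bin_points(points, origin, cell_size, shape):
    """Count (x, y, t) points per grid cell; points outside the grid are dropped.

    Hypothetical helper: the grid layout (origin, cell_size, shape) is an
    assumption for illustration, not the thesis's index grid.
    """
    counts = Counter()
    for p in points:
        cell = tuple(math.floor((v - o) / c)
                     for v, o, c in zip(p, origin, cell_size))
        if all(0 <= i < n for i, n in zip(cell, shape)):
            counts[cell] += 1
    return counts


def getis_ord_scores(counts, shape):
    """Return {cell: G_i^* z-score} for every cell of the grid.

    Assumes binary weights: w_ij = 1 if cell j lies in the 3x3x3 block
    around cell i (including i itself) and inside the grid, else 0.
    """
    n = shape[0] * shape[1] * shape[2]
    mean = sum(counts.values()) / n
    var = sum(v * v for v in counts.values()) / n - mean * mean
    # The statistic is undefined when all cells are equal, or when the grid
    # is so small that one neighborhood covers every cell.
    if var <= 0 or all(d <= 3 for d in shape):
        raise ValueError("G_i^* is undefined for this grid")
    s = math.sqrt(var)

    scores = {}
    for cell in product(*(range(d) for d in shape)):
        w_sum = 0   # sum_j w_ij (equals sum_j w_ij^2 for binary weights)
        wx_sum = 0  # sum_j w_ij * x_j
        for off in product((-1, 0, 1), repeat=3):
            nb = tuple(c + o for c, o in zip(cell, off))
            if all(0 <= i < d for i, d in zip(nb, shape)):
                w_sum += 1
                wx_sum += counts.get(nb, 0)
        denom = s * math.sqrt((n * w_sum - w_sum * w_sum) / (n - 1))
        scores[cell] = (wx_sum - mean * w_sum) / denom
    return scores


def top_k(scores, k):
    """Return the k cells with the highest z-scores (the 'number of Hot-Spots')."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy data: 27 pickups in one cell of a 5x5x5 grid, plus one point outside it.
    pts = [(2.5, 2.5, 2.5)] * 27 + [(9.0, 9.0, 9.0)]
    grid = (5, 5, 5)
    z = getis_ord_scores(bin_points(pts, (0, 0, 0), (1, 1, 1), grid), grid)
    print(top_k(z, 3))
```

In this sketch, the geographic range, date range, and granularity parameters would correspond to the choice of origin, shape, and cell_size, and the number of Hot-Spots to k. The full scan over every cell is what a distributed implementation would need to parallelize.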
Keywords/Search Tags: City Hot-Spots query, Big data, Data management, Spark