Research On Trajectory Frequent Pattern Mining Algorithm Based On Spark

Posted on: 2022-03-14; Degree: Master; Type: Thesis
Country: China; Candidate: L Li; Full Text: PDF
GTID: 2518306524452324; Subject: Computer technology
Abstract/Summary:
Mining patterns from spatiotemporal data has many important applications in understanding human mobility, smart transportation, urban planning, and ecological studies. An important problem in information city construction is how to improve the efficiency of mining frequent patterns from massive trajectory data sets; such patterns can be used for location prediction and location-based services (LBS). Traditional trajectory sequence pattern mining algorithms face a challenging problem: mining may have to generate or examine an explosive number of intermediate subsequences, which seriously affects the construction speed and mining efficiency of the algorithm. Moreover, with the rapid development of positioning technology, spatiotemporal data has become widely available and its scale keeps growing. Traditional trajectory sequence pattern mining algorithms therefore suffer from huge memory cost, low processing speed, and inadequate hard disk space. To solve these problems, this paper proposes a frequent trajectory pattern mining algorithm based on Spark. The research contents are as follows:

(1) Because personal trajectories are uncertain and trajectory items are not explicit, existing traditional sequence mining algorithms cannot be applied directly. First, a grouping and partitioning technique is used to abstract the original trajectory data and convert it into a common time series. This paper then proposes a framework for trajectory pattern mining and a distributed trajectory frequent pattern mining algorithm based on prefix pruning.

(2) To avoid generating redundant trajectory patterns, a path adjacency pruning method and algorithm are designed to prune candidates effectively.

(3) To address the mining efficiency problem caused by large-scale trajectory data, the algorithm is designed and implemented on Spark, a distributed framework with in-memory cluster computing, and the load balancing of the distributed platform is also improved.

(4) Finally, experiments on trajectory abstraction, the pruning strategy, and the distributed algorithm on common data sets show that the proposed algorithm can effectively extract frequent trajectory patterns, especially when dealing with massive trajectory data. Compared with common trajectory pattern mining algorithms, the algorithm not only improves overall performance but also scales well.
Keywords/Search Tags:data mining, frequent trajectory pattern, Spark, trajectory abstraction
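
The abstract does not give the algorithm's details, so the following is only a minimal single-machine sketch of the three ideas it names: trajectory abstraction, prefix-based pattern growth, and path adjacency pruning. Everything specific here is an assumption made for illustration and not the thesis's actual method: the uniform grid and its `cell_size`, the 8-neighbour definition of adjacency, the function names, and the toy data. Spark is deliberately left out so the block runs with the standard library alone.

```python
from collections import defaultdict


def abstract_trajectory(points, cell_size):
    """Map raw (x, y) points to grid cells, collapsing consecutive repeats."""
    cells = []
    for x, y in points:
        cell = (int(x // cell_size), int(y // cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def adjacent(a, b):
    """Assumed adjacency rule: two distinct cells that touch (8-neighbourhood)."""
    return a != b and abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def mine_frequent_patterns(sequences, min_support):
    """Prefix-projection pattern growth with an adjacency constraint.

    Returns a dict mapping each frequent pattern (a tuple of cells) to the
    number of sequences that contain it as a subsequence.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence id, start offset) pairs; the suffix
        # sequences[sid][start:] is what remains after matching the prefix.
        first_pos = defaultdict(dict)  # item -> {sid: first position in suffix}
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                # Adjacency pruning: skip extensions that would require a
                # jump between cells that do not touch.
                if prefix and not adjacent(prefix[-1], item):
                    continue
                if sid not in first_pos[item]:
                    first_pos[item][sid] = pos
        for item, occurrences in first_pos.items():
            # Prefix pruning: an infrequent prefix is never extended.
            if len(occurrences) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(occurrences)
                grow(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    grow((), [(sid, 0) for sid in range(len(sequences))])
    return results


# Toy data (hypothetical): three short trajectories on a 1x1 grid.
raw = [
    [(0.2, 0.3), (0.7, 0.4), (1.5, 0.5), (2.5, 1.5)],
    [(0.1, 0.9), (1.2, 0.2), (2.8, 1.1)],
    [(0.5, 0.5), (1.5, 1.5), (2.2, 1.9)],
]
trajectories = [abstract_trajectory(t, cell_size=1.0) for t in raw]
patterns = mine_frequent_patterns(trajectories, min_support=2)
```

In this toy run, the candidate `((0, 0), (2, 1))` occurs as a subsequence in all three trajectories but is never generated, because the two cells do not touch. In a distributed setting, one plausible design (again an assumption) is to mine the projected database of each frequent length-1 prefix as a separate task, which is also where load balancing across prefixes would matter.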