Distributed Mining Of Universal Companion Pattern On Large-Scale Trajectory Data

Posted on: 2022-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: S J Liu
GTID: 2518306554471144
Subject: Computer technology
Abstract/Summary:
With the wide application of positioning equipment, the quantity of trajectory data is increasing rapidly. Universal companion pattern mining focuses on discovering highly similar behavior paths of moving objects in the temporal and spatial dimensions. Designing efficient and accurate universal companion pattern mining methods for large-scale trajectory data is significant for discovering user preferences and building new business models, and it is also a considerable challenge. On the one hand, massive and growing trajectory data requires universal companion pattern mining to scale well, which single-machine mining frameworks can rarely provide. On the other hand, existing distributed mining frameworks for universal companion patterns pay little attention to the quality of the input data or to the handling of the large number of loose connections in trajectory data. It is therefore valuable to improve both the discovery capability and the performance of universal companion pattern mining. This thesis studies these two issues in depth in the context of distributed clustering and universal companion pattern mining on large-scale trajectory data.

First, for the data preprocessing stage, the density clustering algorithm DBSCANCD, which incorporates motion direction, and the time-dependent clustering balance algorithm TCB are proposed. Together they provide high-quality input for the distributed mining framework and improve its ability to discover universal companion patterns. DBSCANCD handles clustering within a single snapshot of the trajectory data: it performs density clustering on all trajectory points in the same snapshot while taking the direction of movement of the objects into account. Compared with the density clustering algorithms in existing mining frameworks, which generally rely on Euclidean distance alone, DBSCANCD provides higher-quality input for universal companion pattern mining. TCB takes the output of DBSCANCD as input and uses a greedy strategy to address a limitation of DBSCANCD: because it operates at the snapshot level, it cannot reasonably assign cluster boundary points when the complete trajectory is considered. TCB thereby further improves the quality of the input to the universal companion pattern mining algorithm. Extensive experiments indicate that the combined use of DBSCANCD and TCB improves the ability of the distributed mining framework to discover universal companion patterns.

Second, for the universal companion pattern mining stage, the G segment pruning and repartitioning algorithm GSPR and the segmented apriori enumerator algorithm SAE are designed to deal effectively with the loose connections present in most trajectory data, improving both the discovery ability and the performance of the distributed mining framework. For each trajectory, GSPR processes the set of trajectories that cluster together with it and uses a user-defined parameter G to segment each loosely connected trajectory in that set, which provides an effective solution to loose connections. GSPR therefore safeguards the discovery ability of universal companion pattern mining. SAE takes the output of GSPR as input. In a distributed environment, SAE makes fuller use of the cluster's hardware by introducing multithreading. Using forward closure, SAE checks at each step whether a maximal universal companion pattern that meets the requirements exists in the current state; if one exists, it outputs the result early and terminates the current thread. SAE therefore safeguards the performance of universal companion pattern mining. The experimental results indicate that the combined use of GSPR and SAE improves the discovery ability and performance of the distributed mining framework.

Last, building on the four algorithms DBSCANCD, TCB, GSPR and SAE and on the distributed computing platform Spark, the distributed mining framework DMFUCP for universal companion patterns is designed, making full use of Spark's strengths in in-memory computing. Extensive experiments show that, compared with existing universal companion pattern mining frameworks, DMFUCP has better universal companion pattern discovery capability while reducing the time needed to mine each group of universal companion patterns by 20%-40%.
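The abstract does not define the parameter G precisely. As a minimal sketch, assume G is the largest gap (in snapshots) allowed between two consecutive timestamps at which a trajectory still counts as connected; a larger gap is treated as a loose connection and starts a new segment. The function name `split_by_gap` and this interpretation of G are illustrative assumptions, not the thesis's GSPR implementation.

```python
from typing import List


def split_by_gap(timestamps: List[int], g: int) -> List[List[int]]:
    """Split snapshot timestamps into segments wherever the gap exceeds g.

    Assumption: g is the maximum allowed difference between consecutive
    timestamps inside one segment (hypothetical reading of the parameter G).
    """
    segments: List[List[int]] = []
    for t in sorted(timestamps):
        if segments and t - segments[-1][-1] <= g:
            # Gap is small enough: extend the current segment.
            segments[-1].append(t)
        else:
            # First timestamp, or a loose connection: start a new segment.
            segments.append([t])
    return segments


if __name__ == "__main__":
    # Gaps of 4 (3 -> 7) and 7 (8 -> 15) both exceed g = 2.
    print(split_by_gap([1, 2, 3, 7, 8, 15], g=2))  # [[1, 2, 3], [7, 8], [15]]
```

Under this reading, each resulting segment could be handed to a later enumeration step independently, which is one way the segmentation might make distributed processing easier.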
Keywords/Search Tags: distributed mining framework, loose connections, clustering balance, G pruning repartition, segmented enumeration