| With the spread of mobile positioning devices and improved network communication infrastructure, massive trajectory data can be collected in real time. Spatiotemporal trajectory data supports trajectory knowledge mining and early information acquisition, such as driving anomaly detection, traffic congestion monitoring, traffic prediction, and monitoring of crowd gathering behavior during the epidemic. Trajectory big data shares the 4V characteristics of traditional big data, and its information value decays over time. For example, releasing traffic-condition results immediately is more effective for avoiding road congestion. For this reason, real-time processing of trajectory data is a central focus of this paper. Trajectory clustering, which detects group aggregation activity, underlies the urban applications above. However, existing algorithms for real-time stream clustering of spatiotemporal trajectory data still suffer from pseudo-real-time behavior, high latency, unbalanced load, and a lack of awareness between distributed nodes. This paper focuses on real-time trajectory stream clustering on a distributed computing platform and designs a real-time clustering method that produces accurate results quickly and efficiently to address these problems. Real-time distributed trajectory stream clustering is an emerging research area. Some works have made tentative progress, but they have not effectively combined real-time stream processing with distributed computing. Distributed parallel processing architectures offer high throughput and low latency and are currently a general platform for processing massive data. However, the share-nothing nature of a distributed parallel architecture inevitably breaks the semantic integrity of a trajectory. Specifically, a trajectory is sequential data ordered in both time and space. Distributing data across distributed 
nodes breaks the continuity of the trajectory. In addition, the real-time requirements of trajectory data analysis keep rising as user experience and needs change. Solving these problems requires a real-time distributed trajectory stream clustering algorithm that adapts the trajectory data structure to the distributed parallel architecture. To resolve the incompatibility between the distributed parallel architecture and the trajectory structure, it is necessary to relieve the pressure that massive data places on the system and to determine how trajectory data should be computed in a distributed environment. Therefore, this paper reduces the data load through trajectory compression, studies trajectory clustering through the computation between trajectories, and builds a real-time application system on these two studies. Finally, this paper establishes a full-life-cycle technical system centered on trajectory clustering. Specifically, it includes: (1) A lightweight real-time incremental trajectory compression algorithm for trajectory clustering, designed to compress in real time, reduce the data volume, and reduce subsequent processing workload through incremental computing. (2) A lightweight real-time distributed trajectory stream clustering algorithm based on the compressed trajectory representation. We generate an abstract trajectory structure by extending the incremental trajectory compression representation proposed in this paper, and trajectories are compared and clustered on this abstract structure. This solution reduces the amount of data to process and addresses the share-nothing and load-imbalance problems in a distributed environment. (3) Based on the proposed trajectory compression and trajectory clustering, a complete real-time trajectory stream monitoring system is built to monitor urban traffic 
aggregation behavior. This paper studies a real-time distributed trajectory stream clustering method. The method combines the high throughput of the distributed parallel computing architecture, the low latency of stream computing, and the lightweight operation enabled by trajectory compression to cluster trajectory big data in real time. The resulting techniques for real-time cluster monitoring of massive moving objects can be used to monitor a range of urban traffic conditions and to support transportation development in smart cities. |
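
To make the idea of incremental, point-at-a-time compression in contribution (1) concrete, the following is a minimal sketch. It is not the algorithm proposed in this paper, whose details the abstract does not give. It substitutes a generic dead-reckoning style filter: a point is kept only when it deviates from a constant-velocity prediction by more than a threshold. The class name `DeadReckoningCompressor`, the parameter `epsilon`, and the `(t, x, y)` point layout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from math import hypot
from typing import List, Optional, Tuple

# Assumed point layout: (timestamp, x, y) in consistent planar units.
Point = Tuple[float, float, float]


@dataclass
class DeadReckoningCompressor:
    """Illustrative streaming compressor (hypothetical, not the paper's method).

    Each incoming point is processed once, in arrival order, using only the
    last kept point and one velocity estimate, so the per-point cost is O(1).
    """

    epsilon: float  # assumed tolerance on prediction error
    kept: List[Point] = field(default_factory=list)
    _velocity: Optional[Tuple[float, float]] = None

    def push(self, p: Point) -> bool:
        """Consume one point; return True if it is kept in the compressed output."""
        t, x, y = p
        if not self.kept:
            # The first point of a trajectory is always kept as the anchor.
            self.kept.append(p)
            return True

        at, ax, ay = self.kept[-1]
        dt = t - at
        if self._velocity is None:
            # The first point after an anchor only fixes the velocity estimate.
            if dt > 0:
                self._velocity = ((x - ax) / dt, (y - ay) / dt)
            return False

        # Predict where the object should be if it kept moving at that velocity.
        vx, vy = self._velocity
        px, py = ax + vx * dt, ay + vy * dt
        if hypot(x - px, y - py) > self.epsilon:
            # Prediction failed: keep this point as the new anchor.
            self.kept.append(p)
            self._velocity = None
            return True
        return False


if __name__ == "__main__":
    compressor = DeadReckoningCompressor(epsilon=0.5)
    stream = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 2.0, 0.0),
              (3.0, 3.0, 0.0), (4.0, 3.0, 1.0), (5.0, 3.0, 2.0)]
    for point in stream:
        compressor.push(point)
    print(compressor.kept)
```

Because the compressor keeps only constant-size state per moving object, a filter of this kind could in principle run inside a stream operator keyed by object identifier. How the paper's own compressed representation is extended into the abstract trajectory structure used for clustering is not specified in the abstract and is not modeled here.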