With the rapid urbanization and socioeconomic development of China, residents' travel activities have become increasingly complex and diverse in time and space, which places higher demands on urban traffic management and planning decisions. However, traditional resident travel surveys suffer from long-standing problems, such as cumbersome survey organization, data distortion caused by subjective recall, and overly long data update cycles. At the level of the data source, these problems prevent traffic decision makers from gaining a comprehensive grasp of urban traffic operation status. With the development of the new generation of mobile communication technology and the arrival of the big data era, techniques for extracting traffic activity information from cellular signaling data have attracted wide attention in the industry. Thanks to its high sample coverage, relatively low acquisition cost and dynamic updating, cellular signaling data makes it possible for comprehensive urban traffic management to upgrade from an "empirical" mode to a "data-driven" mode. In public health emergencies in particular, it is expected to play an important role in tracing viruses and disrupting transmission chains. However, existing techniques for extracting travel characteristics from signaling data still have problems to be solved, such as inadequate technical applicability, difficulty in evaluating accuracy, and a lack of sensitivity analysis of key technical parameters. This paper integrates techniques from transportation, communication and artificial intelligence. It constructs a closed-loop evaluation system based on cellular signaling data, running from field collection through mining analysis to comprehensive evaluation, and explores two technical routes for trip end extraction, one based on improved clustering methods and one based on a deep learning model. An integrated simulation platform is also constructed to analyze the key technical parameters. The main research points and results are as
follows:

1) With the support of communication operators, a field travel experiment is designed and multi-source data are collected synchronously under multi-factor scenarios: cellular signaling data as the research object, and travel log data and mobile phone sensor data for comparative accuracy evaluation. The scenario design of the travel experiment considers key factors such as different city locations, base station densities and travel purposes, in order to cover the common types of individual travel characteristics. The travel experiment provides data support for solving the long-standing difficulty of evaluating research results based on cellular signaling data.

2) This paper proposes two technical routes for extracting trip ends. Firstly, to address the fixed clustering radius of the clustering algorithms used in existing studies, improved hierarchical agglomerative clustering (HAC) and density-based spatial clustering of applications with noise (DBSCAN) algorithms are proposed to extract trip end clusters; both adaptively adjust the clustering radius according to base station density. Then, for the unsolved problem of optimizing the clustering radii under different base station densities, a fitness function is constructed whose objective is to minimize the average weighted time error, and a genetic algorithm (GA) is used to solve for the optimal clustering radius in each scenario. Secondly, drawing on techniques from the field of artificial intelligence, features are selected from three aspects: motion, density and distance. A travel state classification method based on a bi-directional long short-term memory (Bi-LSTM) network is then proposed. The empirical data are used to tune the parameters and train the model with 5-fold cross-validation for extracting trip end clusters. Thereafter, to address the problem that clusters
in the same trip end oscillate because of communication disturbance, a new algorithm for correcting location oscillation is proposed to further refine the trip end clusters; it uses a suspected oscillation sequence, instead of a fixed time window, as the testing unit. The algorithm overcomes a defect of existing research, namely the difficulty of detecting an intermediate cluster that lasts too long within an oscillation sequence, and thus further reduces the multi-recognition of trip ends. Results on the empirical data show that the accuracy of the proposed clustering algorithms with adaptively selected clustering radii is 6-9% higher than that of the methods proposed in existing studies, which demonstrates the effectiveness of the clustering improvements proposed in this paper. Meanwhile, the trip end recognition method based on the Bi-LSTM model outperforms the improved clustering algorithms, which demonstrates the theoretical advantage of Bi-LSTM as a deep learning model. However, under complex real-world application conditions, training the Bi-LSTM model places high requirements on the completeness of the signaling data source and on computing hardware, so the two technical routes proposed in this paper each have their own advantages in terms of theoretical advancement and engineering applicability. Finally, this paper uses the two proposed technical routes to recognize trip end and travel time information for commuting, home and other non-commuting travel purposes. Technical evaluation indexes, such as the identification proportion and the recognition errors of travel time and activity location, are computed in order to analyze which application scenarios better suit the two improved clustering algorithms and the Bi-LSTM model.

3) An integrated simulation platform combining communication signal simulation and traffic simulation is constructed in this paper. With
the existing traffic simulation software VISSIM, individual travel modes are configured to generate individual travel trajectory data consistent with the real travel experiment. Based on the real mobile communication network layout and parameter configuration, the WINNER II path loss model is used in the signal simulation to determine the serving base station of each simulated individual by the criterion of maximum signal-to-noise ratio. The simulation framework breaks through the disciplinary barriers between simulation technologies in transportation and communication. It can generate signaling data under controllable system state parameters, and avoids the defect of some existing communication signal simulation models, which assume that base station cell boundaries are fixed or hexagonal. It provides data support for the subsequent comprehensive evaluation of different system parameters and influencing factors.

4) For the first time, this paper comprehensively evaluates the impact of communication frequency, base station density, communication disturbance and data type on trip end recognition, using both empirical and simulation data. The applicability of the technical methods as communication networks evolve and communication environments change, as well as the sensitivity of the main technical parameters, are studied through single-factor analysis. The results show that as the communication frequency decreases from 4G high-frequency positioning to simulated 2G low-frequency positioning, the rate of decline in recognition precision for each travel characteristic gradually increases, while raising the communication frequency beyond 4G high-frequency positioning brings only limited improvement in recognition. As the two communication environment factors change, with base station density gradually increasing and communication disturbance gradually decreasing, the overall recognition of trip ends and travel
time also gradually improves, but the multi-recognition rate and activity location recognition fluctuate to some extent. In addition, this paper verifies the applicability of the technical methods to simulated measurement report (MR) positioning data, which future communication networks can collect on a large scale, and compares the results with those from signaling data and GPS trajectory data. The comparison further shows that the positioning frequency of current signaling data can basically meet the needs of travel time and trip end recognition, while improving the positioning accuracy of mobile phone data can significantly improve the recognition effect.
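
The idea in point 2) of adapting the clustering radius to base station density can be illustrated with a minimal sketch. It is not the paper's algorithm: the radius rule in `adaptive_radius` (a multiple of the approximate local inter-station spacing, clamped to a range), all parameter values, and the function names are assumptions made for illustration, and the radii are set by a fixed rule rather than optimized with a genetic algorithm. The clustering itself is a plain DBSCAN in which each point queries its neighbourhood with its own radius.

```python
import math
from collections import deque


def adaptive_radius(point, stations, probe_m=1000.0, k=1.5,
                    r_min=200.0, r_max=1500.0):
    """Hypothetical rule: radius = k * local inter-station spacing, clamped."""
    n = sum(1 for s in stations if math.dist(point, s) <= probe_m)
    if n == 0:
        return r_max                      # no stations nearby: use the widest radius
    density = n / (math.pi * probe_m ** 2)  # stations per square metre
    spacing = 1.0 / math.sqrt(density)      # approximate spacing in metres
    return max(r_min, min(r_max, k * spacing))


def dbscan_adaptive(points, radii, min_pts=3):
    """DBSCAN where point i uses radii[i] for its region query; -1 marks noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= radii[i]]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1                # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(nb)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster       # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:      # j is a core point: keep expanding
                queue.extend(nb_j)
    return labels


# Synthetic layout (metres): a dense 300 m grid of stations and three sparse stations.
dense_stations = [(i * 300.0, j * 300.0) for i in range(-3, 4) for j in range(-3, 4)]
sparse_stations = [(10000.0, 0.0), (11400.0, 0.0), (10000.0, 1400.0)]
stations = dense_stations + sparse_stations

# Signaling records positioned at the serving station's coordinates.
records = [(0.0, 0.0), (300.0, 0.0), (0.0, 300.0), (300.0, 300.0),  # stay in dense area
           (5000.0, 5000.0),                                         # isolated record
           (10000.0, 0.0), (11400.0, 0.0), (10000.0, 1400.0)]        # stay in sparse area

radii = [adaptive_radius(p, stations) for p in records]
labels = dbscan_adaptive(records, radii)
fixed_labels = dbscan_adaptive(records, [450.0] * len(records))
print(labels, fixed_labels)
```

In this synthetic layout, a single fixed radius tuned for the dense area treats the records in the sparse area as noise, whereas the density-dependent radius groups them into one cluster. That is the kind of failure a fixed clustering radius can produce when base station density varies.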