Commuting is central to the normal operation of a city, and traffic congestion is most pronounced during commuting hours. Building a travel choice behavior model to analyze the decision-making mechanism behind commuters' mode choice is basic work for alleviating congestion and managing travel demand. Data quality is a key factor in the prediction accuracy of such models. Traditional mode choice models draw mostly on questionnaire surveys. Questionnaire data capture rich travel information, but they are highly subjective and difficult to sample randomly. Mobile signaling data cannot provide personal attributes, but they cover a wide population and record travel information objectively, so they can serve as benchmark data for evaluating and improving a model's prediction accuracy. Because of privacy protection, the two types of data cannot be matched one to one. This paper uses population classification as a bridge to combine the advantages of questionnaire data and mobile signaling data. The questionnaire data are used to construct commuter travel mode choice models for different populations, the model predictions for the different populations are weighted by the class proportions of the mobile phone population, and the mode shares derived from mobile signaling data are taken as the benchmark for comparing and evaluating the models.

First, the commuting behavior of the target population is analyzed using mobile signaling data. A rule-based algorithm extracts the medium- and long-distance commuters living in the railway corridor. A spatiotemporal density clustering algorithm identifies dwell points and extracts commuting trip periods, and a random forest algorithm identifies commuting travel modes. The resulting commuting characteristics are as follows: the average travel time is 38 minutes, the average travel frequency is 11 times per week, and the travel peaks are 7:00-10:00 in the morning and 17:00-21:00 in the evening. The mode shares are 53% by car, 30% by rail, and 17% by bus.

Second, the population is classified with the k-means++ clustering algorithm, taking the travel frequency of each mode as the classification variable and cosine similarity as the distance measure. In the classification results for both mobile phone holders and questionnaire respondents, class 1 has a relatively high proportion of car travel and is the car-preference population, class 2 has a relatively high proportion of bus travel and is the bus-preference population, and class 3 has a relatively high proportion of railway travel and is the railway-preference population. Among mobile phone holders, classes 1, 2, and 3 account for 55%, 14%, and 31%, respectively; among questionnaire respondents, they account for 19.3%, 40.1%, and 40.6%, respectively.

Then, the nested logit model is used to build a commuter travel mode choice behavior model that accounts for individual heterogeneity. A separate model is constructed for each of the three groups, and the factors influencing mode choice are analyzed. The results show that for the bus-preference and railway-preference groups, travel cost, out-of-vehicle time, and in-vehicle time all have significant effects, and the absolute value of the cost parameter is large, which means these two groups are more sensitive to cost. For the car-preference group, out-of-vehicle time and in-vehicle time have significant effects, but travel cost does not; this group is more sensitive to time.

Finally, the models are compared and evaluated. The model predictions for the different populations are weighted by the class proportions of the mobile phone population. Taking the mode shares obtained from mobile signaling data as the benchmark, the mixed logit model, the nested logit model after the original population classification, and the nested logit model integrating mobile signaling data are compared quantitatively. The absolute prediction errors of the nested logit model integrating mobile signaling data are 2.58% for bus, 1.25% for railway, and 4.10% for car, smaller than those of the other two models. This indicates that the nested logit model fused with mobile signaling data has higher prediction accuracy.
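
The weighting and evaluation step described above can be illustrated with a minimal sketch. The class proportions and benchmark mode shares are the figures reported in this abstract. The per-class predicted shares in `class_predicted_share` are hypothetical placeholders, not results from the paper, and the function names are illustrative. Treating the questionnaire class proportions as the weights of the model without signaling data is also an assumption made here for comparison only.

```python
import numpy as np

MODES = ["car", "rail", "bus"]

# Reported in the abstract: class proportions (class 1 = car preference,
# class 2 = bus preference, class 3 = railway preference).
signaling_class_weights = np.array([0.55, 0.14, 0.31])        # mobile phone holders
questionnaire_class_weights = np.array([0.193, 0.401, 0.406])  # survey respondents

# Reported in the abstract: benchmark mode shares from signaling data
# (order follows MODES: car, rail, bus).
benchmark_share = np.array([0.53, 0.30, 0.17])

# HYPOTHETICAL per-class predicted mode shares (rows: classes 1-3,
# columns: car, rail, bus). These stand in for the output of the
# class-specific nested logit models and are NOT values from the paper.
class_predicted_share = np.array([
    [0.80, 0.12, 0.08],
    [0.20, 0.20, 0.60],
    [0.15, 0.75, 0.10],
])


def weighted_mode_share(weights, class_shares):
    """Combine class-level mode shares into a population-level share."""
    weights = np.asarray(weights, dtype=float)
    class_shares = np.asarray(class_shares, dtype=float)
    weights = weights / weights.sum()  # guard against rounding in the proportions
    return weights @ class_shares


def absolute_error(predicted, benchmark):
    """Absolute difference between predicted and benchmark shares, per mode."""
    return np.abs(np.asarray(predicted) - np.asarray(benchmark))


fused_share = weighted_mode_share(signaling_class_weights, class_predicted_share)
survey_share = weighted_mode_share(questionnaire_class_weights, class_predicted_share)

fused_error = absolute_error(fused_share, benchmark_share)
survey_error = absolute_error(survey_share, benchmark_share)

for mode, f_err, s_err in zip(MODES, fused_error, survey_error):
    print(f"{mode}: signaling-weighted error {f_err:.2%}, "
          f"questionnaire-weighted error {s_err:.2%}")
```

With these placeholder inputs, the only difference between the two predictions is the set of class weights, which isolates the effect of replacing the questionnaire class proportions with those observed in the signaling data.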