
Recognition Of Urban Functional Areas Based On Internet Access Records

Posted on: 2021-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: T Liu
GTID: 2428330605969613
Subject: Control engineering
Abstract/Summary:
Traditional methods for identifying urban functional areas mainly comprise on-site surveys, annotation of aerial or remote sensing images, and collection of point of interest (POI) data. All of these methods require substantial manpower and material resources. Moreover, because data collection often takes several years to complete while the city keeps changing, they cannot meet timeliness requirements. To address these problems, this thesis identifies urban functional areas from users' Internet access records over a given period of time. We extract features from the access records along two dimensions, time and user, and clean and preprocess the extracted features according to the scenarios each multi-class classification algorithm is suited to, so that each algorithm can perform at its best.

In the time dimension, the data set contains half a year of user access records for each region. To capture temporal order, we extract the number of visitors in each region at different times, arrange these counts chronologically, and feed them into a long short-term memory (LSTM) recurrent neural network to explore the temporal patterns in the data. Beyond temporal order, we also consider the fluctuation of users' Internet access on public holidays and weekends, the difference between the number of visitors and the number of visitor sessions, and the effect of statistical features such as variance, range, and percentiles. A random forest algorithm is used to mine these features.

In the user dimension, we take two views. First, based on the mobility of people, each functional area is treated as a node and a user who appears in two areas is treated as an edge connecting them, which yields a connectivity graph between areas. Second, since users belonging to the same area tend to follow similar travel patterns, we extract each user's network usage patterns in each functional area and then compute various statistics over these per-user features for the area, so that the functional area is characterized by its users. After user feature extraction, the Light Gradient Boosting Machine (LightGBM) algorithm is used to mine the features.

In terms of efficiency, because the extracted features are very high-dimensional and coupled with one another, principal component analysis (PCA) is used to reduce their dimensionality and speed up model training. In terms of classification accuracy, ensemble learning methods are introduced to make the most of each sub-model. Experimental comparison shows that fusion methods such as blending greatly improve the final classification accuracy of the model. With the approach in this thesis, the multi-class classification of urban functional areas reaches an accuracy of more than 87%. In addition, visualizing the extracted time-dimension and user-dimension features makes it possible to analyze changes in the popularity of each region and people's travel patterns. These results offer important guidance for urban resource allocation, regional planning, and transportation.
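
To make the user-dimension connectivity graph more concrete, the following is a minimal Python sketch, not the thesis's implementation. It assumes access records are available as `(user_id, area_id)` pairs and uses the number of distinct users shared by two areas as the edge weight. The record layout, the function name `build_area_graph`, and the weighting scheme are all assumptions for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from itertools import combinations


def build_area_graph(records):
    """Build an undirected, weighted area graph from access records.

    records: iterable of (user_id, area_id) pairs (hypothetical layout).
    Returns a dict mapping a sorted (area_a, area_b) tuple to the number
    of distinct users seen in both areas (assumed edge weight).
    """
    # Collect the set of areas each user appeared in; repeated visits
    # to the same area collapse into one entry.
    areas_by_user = defaultdict(set)
    for user_id, area_id in records:
        areas_by_user[user_id].add(area_id)

    # Every pair of areas visited by the same user gets one more unit
    # of weight on the edge between them.
    edge_weights = defaultdict(int)
    for areas in areas_by_user.values():
        for area_a, area_b in combinations(sorted(areas), 2):
            edge_weights[(area_a, area_b)] += 1
    return dict(edge_weights)


# Toy records, invented for illustration only.
records = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "C"),
    ("u1", "A"),  # repeated visit by u1 to area A
]
graph = build_area_graph(records)
print(graph)
```

Per-area graph statistics such as degree or total edge weight could then serve as additional input features for a classifier, although which graph features the thesis actually derives is not stated in the abstract.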
Keywords/Search Tags: Multi-classification, Feature Engineering, LSTM, Random Forest, LightGBM, Ensemble Learning Algorithm