High-resolution Population Mapping Using A Random Forest Model

Posted on: 2020-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: G Qiu
Full Text: PDF
GTID: 2370330596971410
Subject: Cartography and Geographic Information System
Abstract/Summary:
Population data are among the most direct indicators of human activity. Traditional population data come mainly from censuses conducted every ten years and reported at different levels of administrative units. Such data have low temporal resolution and do not support spatial analysis. With the development of remote sensing and geographic information science, the spatialization of population data, especially high-resolution gridded population data, is playing an increasingly important role in understanding and responding to numerous social, economic and environmental issues. At present, the spatial resolution of mainstream gridded population products is 1 km, which cannot meet the growing needs of applications such as fine-grained urban management. In addition, emerging geographic big data such as points of interest (POI) and building footprints offer significant opportunities to improve the accuracy of gridded population products.

This study addresses two problems of mainstream gridded population products: low spatial resolution and incomplete use of supplementary geographic big data. A random forest algorithm was used, with remote sensing data (satellite imagery) and social sensing ancillary data (points of interest, building footprints), to disaggregate census population data onto 100 m grid cells in Zhengzhou, China. The resulting 100 m population grid was then compared with the census data using box plots to explore the causes of outliers. Based on the outliers, the model was rebuilt with the best parameters and ancillary data.

(1) According to the analysis of this study, the population of Zhengzhou is generally distributed radially, concentrating toward the city center. The population is mainly concentrated in the eastern part of Zhongyuan District, the northeast of Erqi District, and the western part of Jinshui District. Among the subdistricts (streets), those with high population density are Ludong Village and Jianshe Road, with an average of more than 200 people per 100 m grid cell. The population density of Guzhen Town is low and the population is evenly distributed, with an average of only 6.004 persons per 100 m grid cell.

(2) Comparing the results of this study with other mainstream gridded population data sets for Zhengzhou shows that the overall spatialization accuracy of this study is better. The correlation between the modeling features of the random forest model and population density is R2 = 91.28. The root mean square error (RMSE) between the spatialized population and the actual population at the township/street administrative unit level is 25783.59, which is lower than that of WorldPop (RMSE = 31543.66), the China Kilometer Grid Population Distribution data set (RMSE = 35800.90), and Gridded Population of the World (RMSE = 33791.59).

(3) Using the mean decrease in accuracy method and the Boruta method, this study examines the variable importance of the ancillary data used in random forest modeling. It finds that, in a more complex urban environment, point-of-interest data, especially the locations of residential areas, parking lots, and banks, have a large impact on the spatial distribution of population. Conversely, variables that vary over relatively large spatial scales, such as temperature and precipitation, contribute little to the spatialization modeling of small-area population data. The variables with high importance scores in this study can be used in future rapid population modeling of small urban areas.
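The disaggregation and accuracy check described in the abstract can be illustrated with a minimal sketch. It assumes a proportional (dasymetric) redistribution, a common way of turning model-predicted density weights into gridded counts; the abstract does not state that the thesis used exactly this scheme. The function names, unit labels, and numbers below are hypothetical, and the per-cell weights stand in for the output of a random forest model.

```python
import math
from collections import defaultdict


def disaggregate(unit_totals, cell_units, cell_weights):
    """Spread each unit's census total over its grid cells in proportion
    to the per-cell weights (assumed to come from a fitted model)."""
    weight_sum = defaultdict(float)
    cell_count = defaultdict(int)
    for unit, weight in zip(cell_units, cell_weights):
        weight_sum[unit] += weight
        cell_count[unit] += 1

    cell_population = []
    for unit, weight in zip(cell_units, cell_weights):
        total = unit_totals[unit]
        if weight_sum[unit] > 0:
            cell_population.append(total * weight / weight_sum[unit])
        else:
            # No usable weights in this unit: fall back to a uniform split.
            cell_population.append(total / cell_count[unit])
    return cell_population


def aggregate(cell_units, cell_population):
    """Sum gridded population back to administrative units."""
    totals = defaultdict(float)
    for unit, pop in zip(cell_units, cell_population):
        totals[unit] += pop
    return dict(totals)


def rmse(estimated, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: two administrative units, five grid cells.
unit_totals = {"A": 1000.0, "B": 300.0}
cell_units = ["A", "A", "A", "B", "B"]
cell_weights = [5.0, 3.0, 2.0, 1.0, 1.0]  # stand-in for model predictions

cell_population = disaggregate(unit_totals, cell_units, cell_weights)
print(cell_population)
print(aggregate(cell_units, cell_population))

# Hypothetical validation: estimated vs. observed totals of two units.
print(rmse([110.0, 90.0], [100.0, 100.0]))
```

In this toy case the three cells of unit A receive 500, 300 and 200 people and the two cells of unit B receive 150 each, so each unit's census total is preserved. Because totals re-aggregated to the same units used as the constraint match by construction, an RMSE is only informative when computed against units that were not used for the redistribution.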
Keywords/Search Tags:Population distribution, Random forest regression, Zhengzhou city, Remote sensing, Point of interest, Building footprint