An Empirical Study Of Factors Affecting Residential Housing Situation Based On Census Data

Posted on: 2017-04-01  Degree: Master  Type: Thesis
Country: China  Candidate: X Peng  Full Text: PDF
GTID: 2309330485494526  Subject: Business Administration
Abstract/Summary:
In recent years, housing has become a prominent social concern, and it is worth exploring the regularities underlying the housing problem through basic data analysis and data-mining models. In this thesis, I build a multiple linear regression model with dummy variables and a random forest model, a big-data algorithm, using basic data for Erqi District, Zhengzhou, from the sixth national census. Taking the household as the unit of analysis, I examine the relationship between per capita housing area and the relevant statistical indicators, and identify the factors that affect per capita housing area.

Multiple linear regression describes how one quantity changes as other quantities change. It can be used for quantitative analysis and classification analysis. When an independent variable is qualitative, it has to enter the regression model as dummy variables. Random forest is a machine learning model that can assess the effect of thousands of independent variables on the dependent variable. It is also insensitive to collinearity among the independent variables and is widely regarded as one of the best-performing algorithms.

Modeling per capita housing area with both the dummy-variable multiple linear regression model and the random forest model leads to the following conclusions: (1) Administrative division, household enrollment, housing origin, age, household, and other indicators have a relatively large effect on per capita housing area at the household level. (2) When the indicators are finely disaggregated, the random forest model is more convenient than the multiple linear regression model, fits better, and has smaller errors.
Keywords/Search Tags: housing status, multiple linear regression, dummy variable, big data, Random Forest
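
The dummy-variable regression described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the census data), the category names `purchased`, `rented`, and `self_built` are hypothetical levels of a housing-origin variable, and the helper `dummy_encode` is not taken from the thesis. The idea shown is only that a qualitative variable with k levels enters the model as k-1 indicator columns, with one level held out as the reference.

```python
import numpy as np

def dummy_encode(labels, reference):
    """Return 0/1 indicator columns for every category except the reference."""
    levels = sorted(set(labels) - {reference})
    cols = np.array([[1.0 if lab == lev else 0.0 for lev in levels]
                     for lab in labels])
    return cols, levels

# Synthetic households (hypothetical values, not census data).
rng = np.random.default_rng(0)
n = 300
housing_origin = rng.choice(["purchased", "rented", "self_built"], size=n).tolist()
age = rng.uniform(20, 70, size=n)

# Assumed "true" effects used only to generate the toy response.
true_effect = {"purchased": 0.0, "rented": -8.0, "self_built": 5.0}
area = (30.0 + 0.2 * age
        + np.array([true_effect[h] for h in housing_origin])
        + rng.normal(0.0, 1.0, size=n))

# Design matrix: intercept, quantitative variable, dummy columns.
dummies, levels = dummy_encode(housing_origin, reference="purchased")
X = np.column_stack([np.ones(n), age, dummies])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, area, rcond=None)
names = ["intercept", "age"] + [f"origin={lev}" for lev in levels]
fitted = dict(zip(names, coef))

for name, value in fitted.items():
    print(f"{name:>20s}: {value: .3f}")
```

Each dummy coefficient is read as the difference in expected per capita housing area relative to the reference category, holding the other variables fixed. Dropping the reference level avoids perfect collinearity with the intercept.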