With the rapid development of China's economy and continuing urbanization, the Chinese real estate market has been growing and changing rapidly, and price forecasting has become an important issue for both the public and decision-makers. This paper forecasts the prices of second-hand houses in Dalian using an improved extreme random forest model. Its contributions are an extreme random forest model (GSR-ERF) built on a hybrid feature selection algorithm weighted by prior chi-square scores, and the use of Hyperopt hyperparameter optimization to tune the parameters of the GSR-ERF model. The research work comprises the following parts. First, the dataset is obtained by using Python to crawl 2022 second-hand housing data for Dalian from the real estate supermarket. The crawled data are cleaned and preprocessed, missing values are filled, the data are standardized, and the resulting second-hand housing data are analyzed and described. Second, the paper proposes the GSR-ERF model, an extreme random forest model based on a hybrid feature selection algorithm that follows a prior chi-square principle. The algorithm obtains one optimal feature subset from each of three selectors (a genetic algorithm, a simulated annealing algorithm, and recursive feature elimination with cross-validation) and computes the chi-square score of each of the three on a prior test set. The chi-square proportions of the three selectors are then combined with their corresponding optimal feature subsets to obtain the final optimal feature subset, on which the extreme random forest model is trained. Third, the hyperparameters of the GSR-ERF model are optimized with Hyperopt. The core of the Hyperopt method is an optimizer based on random search and Bayesian optimization, which repeatedly tries different parameter combinations to find the model configuration that improves the accuracy and goodness of fit of the GSR-ERF model. Finally, the GSR-ERF model is used to predict second-hand housing prices. Experimental comparison shows that the proposed GSR-ERF model reaches an R² of 0.90 when predicting second-hand housing prices in Dalian and achieves the best predictive performance among the compared models. The GSR-ERF model is then applied in a second-hand housing price prediction system.
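The combination of the three feature subsets by score proportion can be illustrated with a minimal sketch. The abstract does not state the exact combination rule, so the weighted-vote rule and the cutoff below are assumptions, and the function name `combine_feature_subsets`, the selector labels, the feature names, and the score values are all hypothetical and are not results from the paper.

```python
from typing import Dict, List, Set


def combine_feature_subsets(
    subsets: Dict[str, Set[str]],
    scores: Dict[str, float],
    threshold: float,
) -> List[str]:
    """Merge per-selector feature subsets by score-weighted voting.

    Assumed rule (not confirmed by the paper): each selector's weight is its
    share of the total score, and a feature is kept when the summed weight of
    the selectors that chose it reaches `threshold`.
    """
    if set(subsets) != set(scores):
        raise ValueError("subsets and scores must name the same selectors")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")

    # Each selector's weight is its proportion of the total score.
    weights = {name: scores[name] / total for name in subsets}

    # A feature's vote is the summed weight of every selector that kept it.
    votes: Dict[str, float] = {}
    for name, features in subsets.items():
        for feature in features:
            votes[feature] = votes.get(feature, 0.0) + weights[name]

    # Keep features whose vote reaches the threshold; sort for stable output.
    return sorted(f for f, v in votes.items() if v >= threshold)


# Hypothetical subsets from the three selectors (illustrative names only).
subsets = {
    "genetic": {"area", "rooms", "floor"},
    "annealing": {"area", "rooms", "age"},
    "rfecv": {"area", "floor", "age"},
}
# Made-up scores on the prior test set, used only to show the arithmetic.
scores = {"genetic": 0.5, "annealing": 0.3, "rfecv": 0.2}

selected = combine_feature_subsets(subsets, scores, threshold=0.6)
print(selected)
```

With these made-up inputs, a feature chosen only by the two lower-scoring selectors falls below the cutoff, while a feature chosen by the highest-scoring selector plus one other is retained. The selected features would then be the inputs on which the extreme random forest model is trained.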