With the rapid economic and political development of China, urbanization is accelerating and the land occupied by cities is increasing day by day. How to use land scientifically and reasonably has become a major research topic. Scholars in China and abroad have explored a range of models and methods to improve the accuracy of land use simulation. However, even as overall accuracy improves, data imbalance leaves the accuracy for minority-class samples relatively low, and raising the simulation accuracy for these minority classes has become an important problem. This paper therefore takes the 21 districts of Chongqing as the study area and optimizes both the data sampling method and the learning algorithm in order to improve the overall simulation accuracy and the accuracy for minority-class samples. The main research contents and conclusions are as follows:

1. The Synthetic Minority Over-sampling Technique (SMOTE) is first used to balance the data set to different orders of magnitude in order to find the best balanced data set. The differently balanced data sets are fed one by one into a model that couples a multilayer perceptron (MLP) with Cellular Automata (CA), and the results are compared. Comparing the overall accuracy and the per-class accuracy of each result shows that the level of balancing has some effect on overall accuracy, while the improvement in accuracy for minority-class samples is more pronounced.

2. The Proximity Weighted Synthetic Oversampling Technique (ProWSyn) is then used to balance the data set for the year 2000 to the best balancing level found above. The balanced data are fed into the model to predict land use in 2010, and the prediction is compared with the actual 2010 data to assess accuracy. To verify the effectiveness of this sampling technique, six other balancing algorithms are compared with ProWSyn. The comparison shows that the ProWSyn algorithm selected in this paper performs best.

3. Ensemble learning is one of the important ways to address data imbalance. A coupled ProWSyn and Adaboost-CA model is constructed and compared with Logistics-CA, GBDT-CA and other models. The results show that the model used in this experiment outperforms the other models in both overall accuracy and accuracy for minority-class samples.

4. Three development scenarios are designed to simulate the land use distribution of the 21 districts of Chongqing in 2030: natural development, policy-guided development, and ecological protection.
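
To make the balancing step in item 1 concrete, the following is a minimal sketch of SMOTE-style oversampling, written with NumPy only. It is not the implementation used in this paper: the function name `smote_oversample`, the toy feature matrix, the class sizes, and the choice of `k` are all assumptions made for illustration. Each synthetic sample is placed at a random point on the segment between a minority sample and one of its `k` nearest minority neighbours. ProWSyn differs in how it weights the minority samples chosen as seeds, which this sketch does not attempt to reproduce.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Return n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration only; X_min holds the feature
    vectors of one minority class, one row per sample (at least 2 rows).
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances within the minority class.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random seed sample and one of its k nearest neighbours.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position between the two points.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])

# Toy example (assumed sizes): 20 minority cells, 200 majority cells,
# 3 driving-factor features per cell.
toy_rng = np.random.default_rng(42)
X_min = toy_rng.normal(loc=2.0, scale=0.5, size=(20, 3))
n_majority = 200

# Balance the minority class up to the majority class size.
synthetic = smote_oversample(X_min, n_new=n_majority - len(X_min), k=5)
X_min_balanced = np.vstack([X_min, synthetic])
print(X_min_balanced.shape)
```

Changing `n_new` is how one would balance to different levels, as described in item 1; the balanced samples would then be passed to the classifier that drives the CA transition rules.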