Retrieval Of Heavy Metal Content In Soil Based On Random Forest Regression Model Based On Multispectral Remote Sensing

Posted on: 2021-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: K Fang
Full Text: PDF
GTID: 2492306470984379
Subject: Surveying and Mapping project
Abstract/Summary:
As China's industrialization deepens and environmental awareness improves, soil heavy metal pollution is receiving more and more attention, and so is research on the degree of soil heavy metal pollution in a given area. The data sources most commonly used in current research are simulated spectra and hyperspectral remote sensing, and the commonly used inversion models are multiple linear regression models, least squares models, and machine learning models such as decision trees and neural networks. To explore a new method for soil heavy metal inversion, this study proposes a random forest regression model based on multispectral remote sensing. It combines the advantages of multispectral remote sensing data (easy to obtain, simple to process, wide coverage, and high spatial resolution) with those of the random forest regression model (suitable for high-dimensional sparse data and strong generalization ability).

The Daxigou iron mine in Zhashui County, Shaanxi Province was used as the study area. The data sources were Landsat 8 remote sensing images of the study area, the ASTER GDEM, and the arsenic (As), copper (Cu), and lead (Pb) contents measured in soil samples. Remote sensing images covering the whole year and the DEM data were processed, the qualified images were selected, and the 6 spectral bands, 8 derived spectral indexes, and 3 topographic factors corresponding to the pixel of each sampling point were extracted. Correlation analysis and multicollinearity analysis were used to determine the modeling month and modeling factors for each heavy metal element. The K-fold cross-validation method was used to build multiple linear regression models, CART regression tree models, and random forest regression models for each heavy metal element, and the models were compared on three accuracy indicators: root mean square error (RMSE), mean absolute percentage error (MAPE), and relative analysis error (RPD). Finally, the optimal random forest regression model for each of the three elements was used to invert its content across the study area; the results were analyzed statistically and mapped as spatial distributions to assess the heavy metal pollution of the soil in the Daxigou mining area.

The experimental results show the following:
(1) In this study, the random forest regression model is better than the multiple linear regression model and the CART regression tree model in terms of stability, accuracy, and reliability.
(2) The K-fold cross-validation method makes full use of the sample data, tries different training and test sets, and obtains the optimal inversion model; it is better than the traditional sample division method when samples are scarce.
(3) For arsenic and lead, the optimal random forest regression model can be determined directly, with high accuracy and reliability; for copper, the size and stability of the out-of-bag (OOB) error must also be compared to determine the optimal random forest regression model.
(4) The maximum contents of the three heavy metal elements are concentrated in the central mining area and the valley. According to field investigations, mining operations in the central mining area have been carried out for a long time. The valley lies lower than the mining area, and settlements, farmland, rivers, and the roads used to transport ore are concentrated there. The inversion results are therefore basically consistent with the actual situation, which supports the use of the random forest regression model based on multispectral remote sensing for soil heavy metal inversion.
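The three accuracy indicators and the K-fold splitting named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names are hypothetical, RPD is taken here as the sample standard deviation of the observed values divided by the RMSE (a common definition; the abstract does not state its formula), and the numbers in the usage lines are invented toy values, not the study's data.

```python
import math
import random
import statistics


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed must be non-zero)."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rpd(observed, predicted):
    """Assumed definition: sample std. dev. of observed values over RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)


def kfold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Toy values for illustration only (not measurements from the study).
toy_observed = [10.0, 20.0, 30.0, 40.0]
toy_predicted = [12.0, 18.0, 33.0, 36.0]
print(rmse(toy_observed, toy_predicted))
print(mape(toy_observed, toy_predicted))
print(rpd(toy_observed, toy_predicted))

for train, test in kfold_indices(n_samples=10, k=5):
    print("train:", sorted(train), "test:", sorted(test))
```

In a comparison like the one described, each candidate model would be fitted on the `train` indices of a fold and scored on its `test` indices, and the indicators would then be compared across folds and models. Lower RMSE and MAPE and higher RPD indicate a better fit under these definitions.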
Keywords/Search Tags: soil heavy metals, multispectral remote sensing, K-fold cross validation, random forest regression model