
Random Forest Algorithm For Soil Heavy Metal Pollution Risk Assessment

Posted on: 2022-01-10 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Y Yu | Full Text: PDF
GTID: 2491306548966779 | Subject: Master of Engineering
Abstract/Summary:
With the advance of industrialization and urbanization, the area of peri-urban farmland affected by soil heavy metal pollution is expanding. Heavy metal contamination degrades the quality of agricultural products grown in the soil and increases the risk of disease for nearby residents, so there is an urgent need to assess the contamination risk of peri-urban farmland soils. At present, this risk is mostly estimated by experts in the field, an approach that is both subjective and inefficient. To make the assessment more efficient, this thesis uses machine learning to evaluate pollution risk automatically. Specifically, the Random Forest (RF) algorithm is improved to account for the imbalanced class distribution of soil heavy metal pollution data, and the improved algorithm is applied to soil data from suburban agricultural areas of Wuhan City to verify its effectiveness. The main work and innovations of this thesis are as follows.

(1) An Adaptive Bagging Weighted Random Forest (ABW-RF) algorithm based on weighted voting is proposed to address the shortcomings of the random forest algorithm in classifying class-imbalanced data. By modifying the Bagging process of the random forest, the algorithm ensures that each decision tree learns from a certain number of minority class samples. It also introduces a weighting factor into the aggregated vote, so that differences in the classification performance of individual decision trees are reflected in the result.

(2) The ABW-RF algorithm is applied to the intelligent evaluation of soil heavy metal pollution risk. First, a comprehensive contamination risk evaluation based on national standards was performed on soil data from suburban agricultural areas of Wuhan City to determine the target value of each sample. The ABW-RF algorithm was then tested on part of the soil data. A comparison of the risk evaluation results obtained with different indexes for computing the weighting factor showed that ABW-RF(Recall), which uses recall as the index, performed best: its classification recall exceeded 0.7 and its classification precision reached 1.0.

(3) The Bayesian optimization algorithm is applied to hyperparameter tuning of the ABW-RF algorithm. The effects of different hyperparameter settings on the classification performance of ABW-RF on the soil data were analyzed experimentally. Based on this analysis, a hyperparameter search space was designed, and Bayesian optimization with G-mean as the objective function was used to search for a better combination of hyperparameters.

A comprehensive analysis of the experimental results shows that, on regular datasets, the classification performance of the proposed ABW-RF algorithm has certain advantages over commonly used machine learning classification algorithms. On the imbalanced dataset, the ABW-RF algorithm performs more consistently and with higher accuracy.
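The ideas named in contributions (1) to (3) can be illustrated with a minimal Python sketch. It is not the thesis's ABW-RF implementation: the abstract does not give the adaptive Bagging rule or the weighting formula, so the `min_minority` parameter, the function names, and the choice to sum raw weights per class are assumptions made only for illustration. The sketch shows a bootstrap draw that guarantees some minority class samples, a vote in which each tree's prediction counts in proportion to a weight such as its recall, and the G-mean score (the geometric mean of per-class recalls) that the abstract names as the tuning objective.

```python
import random
from collections import Counter, defaultdict


def stratified_bootstrap(labels, min_minority, rng):
    """Bootstrap indices with at least `min_minority` minority-class draws.

    `min_minority` is a hypothetical parameter; the thesis's adaptive rule
    is not specified in the abstract.
    """
    n = len(labels)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # Forced minority draws (with replacement), then ordinary bootstrap draws.
    forced = [rng.choice(minority_idx) for _ in range(min_minority)]
    rest = [rng.randrange(n) for _ in range(n - min_minority)]
    return forced + rest


def recall(y_true, y_pred, cls):
    """Recall of class `cls`: TP / (TP + FN); 0.0 if the class is absent."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == cls]
    if not relevant:
        return 0.0
    return sum(p == cls for p in relevant) / len(relevant)


def weighted_vote(tree_predictions, tree_weights):
    """Combine one sample's per-tree predictions, weighting each tree's vote."""
    scores = defaultdict(float)
    for pred, weight in zip(tree_predictions, tree_weights):
        scores[pred] += weight
    return max(scores, key=scores.get)


def g_mean(y_true, y_pred):
    """Geometric mean of the recalls of all classes present in `y_true`."""
    classes = sorted(set(y_true))
    product = 1.0
    for cls in classes:
        product *= recall(y_true, y_pred, cls)
    return product ** (1.0 / len(classes))


# Usage: two weak trees vote 0, one strong tree votes 1.
# Class 0 scores 0.4 and class 1 scores 0.9, so the weighted vote is 1,
# whereas an unweighted majority vote would return 0.
print(weighted_vote([0, 0, 1], [0.2, 0.2, 0.9]))  # prints 1
```

In such a setup, each tree's weight could be its recall on held-out samples, and `g_mean` could serve as the score that a Bayesian optimizer maximizes over the hyperparameter search space.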
Keywords/Search Tags: random forest, weighted voting method, adaptive Bagging, imbalanced data, Bayesian optimization