| Traditional soil survey is a largely manual process conducted over several years. It combines field reconnaissance with aerial photo interpretation to map the spatial distribution of soil types as polygons. Several issues limit the reliability and usefulness of this process and its products. First, the map scale determines the minimum area of a map unit: the larger the scale, the smaller the map unit that can be shown. As a result, the spatial variation of soils within polygons is not captured and small soil bodies are ignored, which generalizes the map in both the spatial and parameter domains. Second, manually delineated polygons ignore the gradual spatial variation of soils: soil properties appear to change abruptly at polygon boundaries. Manual delineation is also tedious, time-consuming and error-prone. It is therefore essential in digital soil mapping to derive highly accurate soil maps from historical resources and data. The proposed method was applied to the Nieshui River basin in Huajiahe Town, Hongan County, Huanggang City, Hubei Province. Conventional soil maps from the Second National Soil Survey were combined with terrain data and remote sensing data, and a GIS platform and the Random Forest model in R were used to extract soil-environment knowledge. The Random Forest model was then used to disaggregate the map units of the conventional soil map into a map with finer spatial and attribute information. The major steps of the proposed method were as follows. 1) Select environmental attributes closely related to pedogenesis: parent material; terrain factors (elevation, slope, aspect, plan curvature, profile curvature and topographic wetness index) extracted from a 10 m resolution digital elevation model (DEM); and multispectral data, from which NDVI (normalized difference vegetation
index), NDWI (normalized difference water index), FPC (first principal component), skewness, information entropy, variance and mean were extracted. 2) Design the sampling points. Points were allocated to soil map units in proportion to unit area, with every unit receiving at least ten points, for a total of 6686 points. The environmental attributes were extracted at these 6686 points and grouped by parent material. 3) Filter the environmental attributes. Uninformative attributes must be rejected to maintain mapping accuracy; the "importance()" function in R helps to select the attributes to keep. 4) Determine the model parameters. M and N are the two most important parameters of the Random Forest (RF) model; they were set according to the out-of-bag error and the stability of the model. 5) Apply the model. The models were built with the Random Forest function in R, yielding four models, one for each of the four parent materials. The soil type of every grid cell in the study area was estimated from the votes of these models, and the predictions together make up the whole soil map. The results showed that the map disaggregated by Random Forest contained more map units and more detailed spatial information. The Random Forest model performed well in both classification and regression, which indicates that it is a reliable way to acquire soil-environment knowledge and an efficient method for fine-scale digital soil mapping. In addition, the variable importance measure of the Random Forest model serves as a basis for feature selection: it can effectively reduce large training sample sets and improve the efficiency of the algorithm while maintaining classification accuracy, providing a reliable basis for future disaggregation of soil map units over large areas. |
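
The area-weighted sampling design in step 2 could be sketched roughly as below. This is a minimal Python illustration, not the authors' R workflow. It assumes that "ten points" is a per-unit minimum and that the remaining points are shared by largest-remainder rounding; the abstract specifies neither. The function name `allocate_points`, its parameters, and the example areas are hypothetical.

```python
from fractions import Fraction
from math import floor


def allocate_points(unit_areas, total_points, min_per_unit=10):
    """Split total_points across map units by area, with a per-unit minimum.

    unit_areas: dict mapping a map-unit id to its area (any consistent unit).
    Returns a dict mapping each map-unit id to its number of sampling points.
    """
    if not unit_areas:
        raise ValueError("unit_areas is empty")
    if any(area <= 0 for area in unit_areas.values()):
        raise ValueError("areas must be positive")
    reserved = min_per_unit * len(unit_areas)
    if total_points < reserved:
        raise ValueError("total_points is too small for the per-unit minimum")

    # Every unit first receives the guaranteed minimum (assumed reading of
    # "every unit contains ten points").
    counts = {unit: min_per_unit for unit in unit_areas}

    # The remaining points are shared in proportion to area. Exact fractions
    # avoid floating-point drift when the quotas are floored.
    remaining = total_points - reserved
    areas = {unit: Fraction(area) for unit, area in unit_areas.items()}
    total_area = sum(areas.values())
    quotas = {unit: remaining * area / total_area for unit, area in areas.items()}
    for unit, quota in quotas.items():
        counts[unit] += floor(quota)

    # Largest-remainder rounding: hand out the points lost to flooring,
    # starting with the units whose quotas had the largest fractional part.
    leftover = total_points - sum(counts.values())
    by_fraction = sorted(quotas, key=lambda u: quotas[u] - floor(quotas[u]),
                         reverse=True)
    for unit in by_fraction[:leftover]:
        counts[unit] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical unit areas; the study's real map units are not listed.
    example_areas = {"unit_a": 50.0, "unit_b": 30.0, "unit_c": 20.0}
    print(allocate_points(example_areas, total_points=130))
```

With the study's figures, `total_points` would be 6686 and `unit_areas` would hold the area of each soil map unit. Reserving the minimum first keeps small units from being dropped when areas are very unequal, which matches the stated goal of representing every unit.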