| Soil is an important carbon pool that can strongly influence global greenhouse gas emissions through its exchange of carbon dioxide with the atmosphere. Soil holds about three times as much carbon as the atmosphere, mainly in the form of soil organic carbon (SOC). SOC is a key indicator of soil quality and an important parameter for agricultural soil, so regular monitoring that supports spatial mapping of SOC content is of great significance. The traditional approach of measuring sample points is time-consuming and costly, and laboratory-based soil reflectance spectroscopy provides spectral data only over a limited area, which makes it unsuitable for large-scale monitoring. Remote sensing data, by contrast, are easy to obtain, cover a wide area, and support ground monitoring throughout the year, which gives them clear advantages for SOC prediction. In particular, the Sentinel-1 and Sentinel-2 satellites, with their short revisit times, high spatial resolution, and high-quality physically calibrated data, provide a large number of images for accurate SOC prediction. This study takes the part of Wanzhou District, Chongqing, that lies north of the Yangtze River as the study area. A total of 565 surface soil samples were collected from orchard, dry land, and paddy field. Synthetic aperture radar satellite images (Sentinel-1), optical satellite images (Sentinel-2), and digital elevation model (DEM) data were used as input variables, and gradient boosting regression tree (GBRT), support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGBoost) were used to build prediction models of SOC content with different combinations of feature variables. The coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE) were used to evaluate prediction accuracy, and the optimal prediction method and feature combination were then selected to predict SOC content. The purpose is to explore the
synergy between radar and optical images in SOC prediction and to reveal the importance of remote sensing data for predicting SOC content under different land use types. (1) The remote sensing predictor variables were extracted from Sentinel-1 and Sentinel-2 data acquired from 2020 to 2021. They comprise two kinds of backscattering coefficients from Sentinel-1 data, namely vertical-vertical (VV, VV_1 to VV_12) and vertical-horizontal (VH, VH_1 to VH_12) polarization backscattering coefficients, together with ten spectral bands (B2, B3, B4, B5, B6, B7, B8, B8A, B11, and B12), four soil radiometric indices, and twenty vegetation radiometric indices from Sentinel-2 data. The soil radiometric indices are the brightness index (BI), second brightness index (BI2), redness index (RI), and color index (CI). The vegetation radiometric indices are the soil adjusted vegetation index (SAVI), transformed soil adjusted vegetation index (TSAVI), modified soil adjusted vegetation index (MSAVI), second modified soil adjusted vegetation index (MSAVI2), difference vegetation index (DVI), ratio vegetation index (RVI), perpendicular vegetation index (PVI), infrared percentage vegetation index (IPVI), weighted difference vegetation index (WDVI), transformed normalized difference vegetation index (TNDVI), green normalized difference vegetation index (GNDVI), global environmental monitoring index (GEMI), atmospherically resistant vegetation index (ARVI), normalized difference index (NDI45), MERIS terrestrial chlorophyll index (MTCI), modified chlorophyll absorption ratio index (MCARI), red-edge inflection point index (REIP), inverted red-edge chlorophyll index (IRECI), pigment-specific simple ratio algorithm (PSSRa), and normalized difference vegetation index (NDVI). In addition, three topographic factors were extracted from the 30 m DEM data: elevation, slope, and aspect. On this basis, four prediction models were developed: Model A, Sentinel-1 + Sentinel-2 (spring) + DEM; Model B, Sentinel-1 + Sentinel-2 (summer) + DEM; Model C, Sentinel-1 + Sentinel-2 (autumn) + DEM; and Model D, Sentinel-1 + Sentinel-2 (winter) + DEM. (2) The results of
SOC prediction based on different machine learning methods showed that XGBoost achieved the highest accuracy for orchard, dry land, and paddy field. In orchard, the ranking by R² was XGBoost > RF > GBRT > SVR. In both dry land and paddy field, the ranking by R² was XGBoost > GBRT > SVR > RF. (3) The results of SOC prediction based on the optimal prediction method (XGBoost) and the different prediction models (Model A, Model B, Model C, and Model D) showed that Model A achieved the highest prediction accuracy for orchard, dry land, and paddy field. In orchard, the ranking by R² was Model A > Model C = Model D > Model B; in both dry land and paddy field, the ranking by R² was Model A > Model B > Model D > Model C. These results indicate that the acquisition time of the optical satellite images had a significant effect on SOC prediction with remote sensing data, and that spring Sentinel-2 data were effective predictor variables for SOC content. (4) The prediction results for SOC content under different land use types based on the optimal method (XGBoost) and model (Model A) showed that accuracy was highest for orchard and moderate for dry land and paddy field. The values of R², MAE, and RMSE for SOC prediction in orchard were 0.78, 0.10%, and 0.13%, respectively, and the variables with high importance scores were elevation (DEM), VV_4 (Sentinel-1), VH_9 (Sentinel-1), and VH_5 (Sentinel-1), indicating that the Sentinel-1 and DEM data contributed more to the model than the Sentinel-2 data. The values of R², MAE, and RMSE for SOC prediction in dry land were 0.49, 0.16%, and 0.19%, respectively, and the key predictor variables were SAVI (Sentinel-2), VH_4 (Sentinel-1), VV_5 (Sentinel-1), and B11 (Sentinel-2), indicating that Sentinel-2 data contributed more to the prediction model for dry land than Sentinel-1 data. The values of R², MAE, and RMSE in paddy field were 0.59, 0.17%, and 0.21%, respectively, and the variables that contributed more
to the model were VH_4 (Sentinel-1), MTCI (Sentinel-2), VV_2 (Sentinel-1), and BI (Sentinel-2), indicating that Sentinel-1 data were more important than Sentinel-2 data for SOC prediction in paddy field. (5) The spatial distribution of SOC content in the study area was mapped using XGBoost and Model A. The mean SOC content in orchard is 0.73%, with a range of 0.39% to 1.73%; the mean in dry land is 0.74%, with a range of 0.29% to 1.42%; and the mean in paddy field is 0.92%, with a range of 0.44% to 1.52%. The spatial distribution shows that SOC content in orchard, dry land, and paddy field all exhibits clear clustering of high and low values. |
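
The three accuracy measures used throughout the abstract (R², MAE, and RMSE) can be illustrated with a minimal sketch. It assumes the common definition R² = 1 - SS_res / SS_tot; the abstract does not state which definition the study used. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired observed and predicted values.

    Assumes R^2 = 1 - SS_res / SS_tot, which may differ from the
    definition used in the study.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical SOC contents in percent, made up for illustration only.
observed = [0.50, 0.70, 0.90, 1.10]
predicted = [0.60, 0.70, 0.80, 1.10]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}  MAE={mae:.2f}%  RMSE={rmse:.3f}%")
```

Because MAE and RMSE carry the units of the target variable, they are reported in percent SOC, which matches how the abstract states them. R² is unitless.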