| Terrestrial ecosystems are the basis of human survival and sustainable development. Against a background of global extreme climate events and increasingly frequent human activities, the terrestrial ecosystem carbon cycle has been greatly affected. Vegetation plays an important role in the global carbon cycle by shifting its phenology in response to climate change and by absorbing, through photosynthesis, the carbon dioxide (CO₂) emitted by human activities. It is therefore crucial to accurately quantify the total amount of carbon fixed by vegetation through photosynthesis, known as gross primary productivity (GPP). Over the past few decades, great progress has been made in quantifying and understanding the temporal and spatial patterns of GPP through flux sites, remote sensing products, and model simulations. Because site observations are sparse, it is difficult to estimate terrestrial ecosystem GPP directly from observational data. As an alternative to traditional process models and dynamic vegetation models, machine learning (ML) models have been widely used in GPP estimation. However, because the training data (i.e., observations) span a short period, current ML-based data products do not capture the long-term trend of GPP well, and they are affected by differences in growth environment among plant functional types (PFTs), so large uncertainties remain in GPP estimates. How to minimize the error of terrestrial ecosystem GPP estimation using the limited observational data from flux sites has therefore become a scientific problem. In this study, the random forest (RF) method was used to estimate GPP separately for different PFTs by combining site observations and remote sensing data, and in particular to distinguish C3 crops (CRO_C3) from C4 crops (CRO_C4), so as to improve the accuracy of GPP estimation. We constructed a model using meteorological data and measured GPP values from FLUXNET2015 and ChinaFLUX flux sites, as well as
spatial values of LAI for each site. We used this model to produce 2 GPP datasets: the global ECGC_GPP (0.05°, monthly) from 1999 to 2019, and the China ECGC_GPP (1 km, daily) from 2003 to 2019. We evaluated the accuracy of these datasets and reached the following main conclusions: (1) Compared with the non-PFT training model, the PFT training model was more accurate. We compared the accuracy of the RF training model before and after distinguishing PFTs. The R² of evergreen broad-leaved forest (EBF), mixed forest (MF), and CRO_C4 each increased by more than 0.06, and the RMSE of CRO_C4 decreased by 1.46 g C m⁻² d⁻¹. In addition to improving the accuracy of the cropland GPP training model, the overall R² of the global and China training models increased by 0.07 and 0.05, respectively (improvements of 10.29% and 7.14%). Although the accuracy improvement differed significantly between the global and China flux sites, the overall R² of the training model increased by 0.07 and 0.04 for the global and China flux sites, respectively (improvements of 10.14% and 5.56%). The relative contribution rates of the feature variables showed that LAI dominated the construction of the RF training models that differentiate PFTs (the PFT training models), so that estimated GPP followed the changes in LAI. The PFT training model therefore generalized better and offered a larger margin of accuracy improvement, avoiding the error accumulation caused by interaction among data from different PFTs. (2) Estimating GPP by differentiating PFTs corrected the low interannual fluctuation of total GPP and the “high value underestimation” of cropland. We estimated the global GPP dataset (ECGC_GPP) from 1999 to 2019, with an annual total of 117.14±1.51 Pg C yr⁻¹ and a significant upward trend of 0.21 Pg C yr⁻² (p < 0.01). Except for Oceania, the annual total GPP of all continents showed an
increasing trend, with Asia showing the largest increase (0.09 Pg C yr⁻²). ECGC_GPP followed the latitudinal distribution of biomes, showing two peaks in warm and wet biomes (temperate crops and tropical forests). Compared with the 4 other GPP datasets (FLUXCOM_GPP, MODIS_GPP, NIRv_GPP, and Revised_EC_LUE_GPP), ECGC_GPP demonstrated favorable spatiotemporal consistency. The interannual trend of GPP increased from 0.01 Pg C yr⁻² in FLUXCOM_GPP to 0.21 Pg C yr⁻² in ECGC_GPP, and 17.53% of the grid points in ECGC_GPP exhibited a trend of more than 10 g C m⁻² yr⁻². In ECGC_GPP, the estimated value of cropland grid points increased by 76.38%, and the estimated cropland total increased by 18.68%. In addition, ECGC_GPP had the highest cropland R² (0.55) among the 5 GPP datasets, effectively mitigating the widespread “high value underestimation” of cropland GPP. The uncertainty analysis of ECGC_GPP showed that tropical forests accounted for the largest share of GPP yet had relatively few flux sites, which made GPP estimates in tropical regions highly uncertain. By contrast, GPP estimates were more robust in areas with more flux sites. The spatial uncertainties of the 5 GPP datasets showed that ECGC_GPP was relatively reliable in most biomes. Therefore, compared with FLUXCOM_GPP, ECGC_GPP corrected both the low interannual fluctuation of GPP and the “high value underestimation” of cropland GPP. (3) The accuracy of GPP estimation for China was improved by distinguishing PFTs and by spatiotemporal downscaling. Building the training model at high spatial and temporal resolution effectively improved the accuracy of GPP estimation in China: R² improved by 0.17 (an improvement of 42.5%) through the downscaling of spatial and temporal resolution. As a result, ECGC_GPP had the highest accuracy among the 6 GPP
datasets (R² = 0.57). During the study period, monthly mean GPP in China showed clear seasonal variation. The annual mean of ECGC_GPP in China was 1008.31 g C m⁻² yr⁻¹, and the annual total was 7.49±0.24 Pg C yr⁻¹. The annual means of the 6 GPP datasets in China were consistent in spatial distribution, decreasing from east to west and from south to north. From 2003 to 2019, ECGC_GPP showed an increasing trend of 0.05 Pg C yr⁻², indicating that vegetation growth in China was good overall; southern and agricultural areas showed a strong upward trend, while the northwest arid area showed little change. The annual mean GPP per unit area differed significantly among PFTs in China, with forest (FOR) having the highest annual mean and variation. Therefore, improving the spatial and temporal resolution can improve the accuracy of GPP estimation in China and allow its changes to be analyzed accurately. |
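
The abstract reports each accuracy gain twice: as an absolute increase in R² and as a relative improvement in percent. The sketch below shows how the two figures relate, assuming the percentage is the absolute gain divided by the baseline R². The baseline values it prints are inferred from the reported numbers and are not stated in the abstract; the function name and dictionary labels are illustrative only.

```python
def implied_baseline(delta_r2, relative_gain_pct):
    """Baseline R2 implied by an absolute gain and its relative gain in percent.

    Assumes relative_gain_pct = 100 * delta_r2 / baseline (an assumption,
    not a definition given in the abstract).
    """
    return delta_r2 / (relative_gain_pct / 100.0)


# (absolute R2 gain, relative gain in %) as reported in the abstract.
reported = {
    "global training model": (0.07, 10.29),
    "China training model": (0.05, 7.14),
    "global flux sites": (0.07, 10.14),
    "China flux sites": (0.04, 5.56),
    "China downscaling": (0.17, 42.5),
}

# Inferred baselines, rounded to two decimals like the reported R2 values.
baselines = {
    name: round(implied_baseline(delta, pct), 2)
    for name, (delta, pct) in reported.items()
}

for name, base in baselines.items():
    delta = reported[name][0]
    print(f"{name}: baseline R2 ~ {base:.2f}, after gain ~ {base + delta:.2f}")
```

Under this assumption, the downscaling case implies a baseline R² of about 0.40, and 0.40 + 0.17 = 0.57 agrees with the R² = 0.57 reported for the China ECGC_GPP dataset.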