
Predicting Species-Level Biomass Using Machine Learning And Zero-Inflated Models In The Great Xing'An Mountains

Posted on: 2020-10-09
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Wang
GTID: 2393330596970855
Subject: Cartography and Geographic Information System
Abstract/Summary:
Forests play an important role in the global carbon cycle. Forest biomass indicates both the carbon sequestration capacity of a forest and its carbon balance, so quantifying forest carbon sinks depends on accurate biomass estimation, which in turn matters for global carbon cycle research and for the sustainable development of forests. This study takes the Great Xing'an Mountains as the study area. It uses Sentinel-1 radar data and Sentinel-2 optical remote sensing data as the main data sources, combined with topographic data, meteorological data and forest resources inventory data. It compares the performance of five common machine learning models, k-nearest neighbor (KNN), support vector machine (SVM), classification and regression tree (CART), random forest (RF) and stochastic gradient boosting (SGB), in estimating total forest biomass and the biomass of larch, white birch, aspen, pine and spruce. A zero-inflated model was then used to improve the biomass estimation models for spruce and pine, and the optimal model for each tree species was selected. On this basis, the biomass of the different tree species in the Great Xing'an Mountains was estimated with the optimal modeling method, and the spatial distribution of biomass was analyzed. The main research contents and conclusions are as follows:

(1) A total of 44 features were extracted: the single-band information of the Sentinel-2 optical data and the vegetation indices calculated from it, the backscattering coefficients of the Sentinel-1 radar data and their texture features, supplemented by terrain data and climate data. Based on these 44 features, the relationship between features and biomass was modeled with the five machine learning methods (k-nearest neighbor, support vector machine, decision tree, random forest, stochastic gradient boosting). Using the accuracy of ten-fold cross-validation as the standard, the five methods were compared for estimating the biomass of the forest as a whole and of larch, white birch, aspen, pine and spruce, and the best model for each was selected. In general, random forest and stochastic gradient boosting perform better. For total forest biomass, the stochastic gradient boosting model has the highest accuracy. Stochastic gradient boosting is the optimal model for estimating the biomass of larch, aspen and spruce, and random forest is the best model for white birch and pine.

(2) For spruce and pine, the sample contains many zero values. When an ordinary machine learning regression model is used to estimate the biomass of these two species, zero-value samples are overestimated and non-zero samples are underestimated. A zero-inflated model was therefore used for spruce and pine. The model has two steps: first classify whether the target species is present, then predict the biomass of the target species only in the areas where it is present. In both steps, k-nearest neighbor, support vector machine, decision tree, random forest and stochastic gradient boosting were compared, and the model with the highest classification accuracy and the model with the highest regression accuracy were selected respectively. The RF-RF zero-inflated model is the best model for estimating spruce, and the SGB-RF zero-inflated model is the best model for estimating pine.

(3) The best model for each target was used to estimate biomass throughout the study area. The forest biomass in the Great Xing'an Mountains is 136 Mg/ha. In the Xilinji, Tuqiang and Amur Forestry Bureaus, forest biomass is low. Larch is distributed throughout the Great Xing'an Mountains. White birch is evenly distributed throughout the study area, but its biomass is generally lower than that of larch. Pine is mainly distributed in the northern part of the Great Xing'an Mountains. Among the five tree species, spruce has the smallest distribution area and occurs only in a small area in the central part of the forest region.

(4) The random forest feature importance method was used to assess the importance of the 44 features for estimating tree species biomass. Among the 44 features, temperature and precipitation are the more important ones for estimating the biomass of the different tree species, and the information entropy of the VV polarization also has an important influence. When fitting the biomass of spruce, the texture features of the radar data have less influence.
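The two-step zero-inflated procedure described in item (2) of the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the class name `TwoStepBiomassModel`, the synthetic plot data and the choice of k-nearest neighbor for both steps are assumptions made for illustration, whereas the abstract reports RF-RF and SGB-RF as the best combinations. The sketch only shows the structure: a presence/absence classifier trained on all plots, followed by a regressor trained only on plots where the species is present.

```python
import math


def nearest_indices(rows, x, k):
    """Return the indices of the k rows closest to x (Euclidean distance)."""
    order = sorted(range(len(rows)), key=lambda i: math.dist(rows[i], x))
    return order[:k]


class TwoStepBiomassModel:
    """Hypothetical two-step model: classify presence, then regress biomass.

    KNN stands in for both steps here; any classifier/regressor pair
    could be substituted.
    """

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # Step 1 training data: every plot, labeled present (1) or absent (0).
        self.X_all = list(X)
        self.present = [1 if value > 0 else 0 for value in y]
        # Step 2 training data: only plots with non-zero biomass.
        self.X_pos = [x for x, value in zip(X, y) if value > 0]
        self.y_pos = [value for value in y if value > 0]
        return self

    def predict_one(self, x):
        # Step 1: majority vote among the k nearest plots on presence.
        idx = nearest_indices(self.X_all, x, self.k)
        votes = sum(self.present[i] for i in idx)
        if 2 * votes <= len(idx):
            return 0.0  # classified as absent, so biomass is set to zero
        # Step 2: average biomass of the k nearest plots where the species occurs.
        idx = nearest_indices(self.X_pos, x, min(self.k, len(self.X_pos)))
        return sum(self.y_pos[i] for i in idx) / len(idx)

    def predict(self, X):
        return [self.predict_one(x) for x in X]


# Synthetic data (an assumption, not thesis data): two features per plot.
# The species occurs only where the first feature exceeds 0.5, and its
# biomass then grows with the second feature.
X = [(a / 10, b / 10) for a in range(11) for b in range(11)]
y = [(10 + 20 * b) if a > 0.5 else 0.0 for a, b in X]

model = TwoStepBiomassModel(k=3).fit(X, y)
preds = model.predict([(0.1, 0.5), (0.9, 0.5)])
print(preds)
```

Because the regressor never sees the zero-valued plots, its predictions for occupied areas are not pulled toward zero, and because absent areas are set to exactly zero by the classifier, zero-valued plots are not overestimated. This is the motivation the abstract gives for the two-step design.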
Keywords/Search Tags: biomass, Sentinel-1 radar data, Sentinel-2 optical data, machine learning, zero-inflated model