Filling Method For Missing Data In Forest Resource Sampling Inventory

Posted on: 2020-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: F Liu
GTID: 2370330626451169
Subject: Forest management
Abstract/Summary:
Forest inventory requires many factors and large volumes of data. In actual surveys, the collected data are often incomplete because of complex terrain, inaccessible sample plots, or other reasons. Filling methods for missing data therefore need to be studied to improve the accuracy of data analysis and to solve the missing-data problem in forest inventory. The National Forest Continuous Inventory data and a 2014 Landsat-8 OLI remote sensing image of Chenzhou City, Hunan Province, were used as the main data sources. Based on spatial autocorrelation analysis and semi-variance analysis of the average Diameter at Breast Height (DBH) of the sample plots, spatial filling methods, non-spatial filling methods, and remote sensing-based filling methods were developed and compared for filling missing average DBH. Ten-fold cross-validation was employed to evaluate the accuracy of each method and to select the best filling method. The selected method can thus provide a reliable data basis for forest inventory and statistics. The results show that:

(1) The average DBH of the plots was examined by geostatistical analysis. Its distribution showed significant spatial autocorrelation and spatial clustering of high values: the global Moran's index was 0.114 and the standardized Z value was 5.334. A few regions showed significant negative correlation, that is, high values surrounding low values or low values surrounding high values, and low values were generally scattered. The average DBH of sample plots in the study area also had spatial heterogeneity. Semi-variance analysis showed that this heterogeneity was obvious and that spatial autocorrelation was present within a range of 20.7 km. The spatial pattern of average DBH on the sample plot grid showed moderate to strong spatial autocorrelation. The variability is mainly caused by structural factors such as climate and topography, while the influence of random factors such as weeding, fertilization, and intermediate cutting is weak. Among the theoretical semi-variance models, the exponential model fitted this variation best.

(2) Of the two non-spatial filling methods, the EM algorithm was more accurate than the simple regression algorithm. The overall filling performance of both was poor, because they are based on traditional statistical principles and ignore the spatial distribution characteristics of the variables. Spatial filling methods based on geostatistical analysis were then developed. Among them, Kriging interpolation was the most accurate, with an accuracy of 0.46, because it fully considers geostatistical characteristics such as the relationships between sampling plots, spatial location, and spatial distribution. Within Kriging interpolation, the exponential function had the highest interpolation accuracy, which is consistent with the theoretical semi-variogram model selected when the variation of average DBH was analyzed. The spline function method had the lowest accuracy, followed by the IDW method. All the spatial filling methods, which consider the spatial characteristics of the data, were significantly better than the non-spatial filling methods. Therefore, for data with spatial characteristics, filling methods must consider the spatial distribution of the variables; doing so can greatly improve filling accuracy.

(3) Among the five remote sensing-based filling models, random forest (RF) was the best, with a filling accuracy of 0.76, followed by K-Nearest Neighbor (KNN), the Bagging algorithm, Artificial Neural Network, and Multiple Linear Regression. RF consists of many decision trees that work together in the modelling procedure, so the six environmental factors can be fully used by RF. KNN can better reflect the interactions among adjacent pixels. Of the six environmental variables, only elevation and the humidity of soil and vegetation were important for the average DBH of the forest in the study area.

(4) Comparing the most accurate spatial filling model (Kriging interpolation) with the most accurate remote sensing-based filling model (RF), RF had higher precision and smaller error. The spatial distribution map of average DBH in Chenzhou City obtained by RF is more precise and easier to interpret. The map shows a spatial pattern that is lower in the west and higher in the east, which is consistent with the pattern of the sample plots, indicating that RF can accurately fill the missing data.

(5) The spatial distribution pattern of average DBH obtained by RF is highly consistent with the patterns of elevation and water resources in Chenzhou City, indicating that elevation and hydrology are important factors affecting DBH. According to the RF map of average DBH in Chenzhou City, Guidong County has the largest average DBH, followed by Rucheng County and Zixing City, while the areas with smaller average DBH are concentrated in Jiahe County and Yizhang County in the west of Hunan Province. The area occupied by each average DBH range followed the same trend: areas of low average DBH are large and areas of high average DBH are small. This shows that the area of small-diameter timber forest is larger than that of large-diameter timber forest.

It can be concluded that the distribution of average DBH in the forest inventory of Chenzhou City has significant spatial autocorrelation and spatial heterogeneity. For missing average DBH data, both Kriging interpolation and the remote sensing-based RF model can fill the missing values effectively. Of the two, the random forest machine learning algorithm is the best method for filling missing data in this study.
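
The evaluation procedure described above, where plots are held out, filled from the remaining plots, and scored by k-fold cross-validation, can be sketched in a few lines. This is only an illustrative sketch, not the thesis's implementation: it uses IDW (one of the spatial methods compared) because it needs no fitted model, and the function names `idw_fill` and `kfold_rmse`, the power parameter of 2, the use of RMSE as the error measure, and the synthetic plot coordinates are all assumptions that do not come from the abstract.

```python
import math
import random


def idw_fill(known, target, power=2.0):
    """Estimate the value at target (x, y) from known (x, y, value) plots
    by inverse distance weighting. power=2 is an assumed default."""
    num = 0.0
    den = 0.0
    for x, y, v in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return v  # target coincides with a known plot
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den


def kfold_rmse(plots, k=10, seed=0):
    """Hold out each fold in turn, fill its values from the other folds,
    and return the root mean squared error over all plots.
    Assumes at least two non-empty folds so the training set is never empty."""
    idx = list(range(len(plots)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_err = 0.0
    for fold in folds:
        held = set(fold)
        train = [plots[i] for i in idx if i not in held]
        for i in fold:
            x, y, v = plots[i]
            sq_err += (idw_fill(train, (x, y)) - v) ** 2
    return math.sqrt(sq_err / len(plots))


if __name__ == "__main__":
    # Synthetic stand-in for sample plots: a 10 x 10 grid whose "average DBH"
    # rises from west to east, plus a little noise. Not real inventory data.
    rng = random.Random(42)
    plots = [
        (float(col), float(row), 10.0 + 0.5 * col + rng.gauss(0.0, 0.3))
        for row in range(10)
        for col in range(10)
    ]
    print("10-fold RMSE of IDW filling:", kfold_rmse(plots, k=10))
```

The same `kfold_rmse` loop could score any other filling method by swapping out `idw_fill`, which is how a like-for-like comparison across methods would be obtained.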
Keywords/Search Tags: Forest resource sampling investigation, Average DBH, Missing data, Filling methods, Chenzhou