Research On Interpolation Method Of Soil Missing Value

Posted on:2022-04-26

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Zhang

GTID:2493306737476504

Subject:Computer Science and Technology

Abstract/Summary:

Traditional methods for handling missing values in soil data are largely confined to the field of soil science. Although these methods are specialized and accurate, they give little consideration to cross-industry research and use. To address this problem, this paper introduces data mining methods and takes two soil attributes, pH and classification, as examples for interpolating missing soil values.

(1) For missing soil pH data, this paper compares the interpolation performance of Multiple Regression, KNN, Random Forest, SVM, Neural Network and Multiple Imputation. The main work of this part is as follows: (a) Through extensive training and testing on a soil dataset, the parameters of each method are tuned and a missing value interpolation model is established. (b) Three metrics are used to evaluate the performance of each method on the soil dataset at different missing rates. The results show that KNN and Random Forest with optimal parameters are the least affected by the missing rate of the dataset and give the best interpolation results.

(2) For missing soil classification data, the main work is as follows: (a) This paper constructs a mathematical model that describes the missing-data problem for the soil classification attribute, namely the discrete single attribute data missing (DSADM) problem. (b) A general interpolation algorithm based on hierarchical clustering, DSADM_HC, is proposed for missing values of discrete attribute data in order to solve the DSADM problem.

(3) This paper applies DSADM_HC to interpolate the missing values of the soil classification attribute. DSADM_HC consists of three parts: hierarchical clustering for missing attributes, cluster classification of discrete attributes, and cluster mapping of missing discrete attributes. In the hierarchical clustering stage, this paper proposes a dimension number setting index based on dense sampling (DNSDS), which helps dimensionality reduction algorithms determine the dimension of the dimension-reduced dataset. In the cluster classification stage, a cluster selection strategy of the best discrete attribute distribution based on K is proposed to select the best cluster division result from the hierarchical tree. The experiments show that (a) DSADM_HC is effective for missing values of the soil classification attribute; (b) using DNSDS to set the dimension number of the dimension-reduced dataset gives DSADM_HC its best interpolation results for the soil classification attribute; (c) the choice of cluster distance measure affects the interpolation results, and with the best-performing measure the highest accuracy reaches 79.2%.
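The evaluation idea in part (1), hiding known values at a chosen missing rate, interpolating them, and scoring the result, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the thesis: the function names `knn_impute` and `rmse_at_missing_rate`, the synthetic data, the use of RMSE as the error measure, and the plain mean-of-neighbours KNN rule.

```python
import math
import random


def knn_impute(rows, target, k=3):
    """Fill None entries in column `target` with the mean of that column
    over the k nearest rows whose target value is known (hypothetical helper)."""
    feats = [i for i in range(len(rows[0])) if i != target]
    known = [r for r in rows if r[target] is not None]
    filled = []
    for r in rows:
        if r[target] is not None:
            filled.append(list(r))
            continue
        # Euclidean distance over the non-target columns only
        nearest = sorted(
            known,
            key=lambda q: math.dist([r[i] for i in feats], [q[i] for i in feats]),
        )[:k]
        estimate = sum(q[target] for q in nearest) / len(nearest)
        filled.append([estimate if i == target else v for i, v in enumerate(r)])
    return filled


def rmse_at_missing_rate(rows, target, rate, k=3, seed=0):
    """Hide a fraction `rate` of the known target values, impute them,
    and return the root-mean-square error on the hidden entries."""
    rng = random.Random(seed)
    n_hide = max(1, int(rate * len(rows)))
    hidden = set(rng.sample(range(len(rows)), n_hide))
    masked = [
        [None if (j == target and i in hidden) else v for j, v in enumerate(r)]
        for i, r in enumerate(rows)
    ]
    filled = knn_impute(masked, target, k)
    squared = [(filled[i][target] - rows[i][target]) ** 2 for i in hidden]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Synthetic stand-in for a soil table: two made-up features and a pH-like column.
    rng = random.Random(42)
    data = []
    for _ in range(200):
        a, b = rng.uniform(0, 50), rng.uniform(0, 50)
        data.append([a, b, 5.0 + 0.02 * a + 0.01 * b + rng.gauss(0, 0.1)])
    for rate in (0.1, 0.3, 0.5):
        print(rate, round(rmse_at_missing_rate(data, target=2, rate=rate), 3))
```

Repeating the loop for several methods and missing rates gives the kind of comparison the abstract describes; a real study would use the actual soil dataset and the tuned parameters of each method.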

Keywords/Search Tags:

Soil Attribute Data, Missing Data, Data Dimensionality Reduction, Distance Between Clusters, Hierarchical Clustering

Related items

1	The Research On The Prediction Of Forest Resources With Missing Data
2	Realizing Genomic Prediction Of Phenotypic 1000 Grain Weight In Rapeseed Based On Data Dimensionality Reduction
3	Research On Filling Methods For Missing Data Of Bamboo Germplasm Resources
4	Research Into Responsibility Problems With Elimination Of Poverty By Vocational Education In Rural Areas
5	Research On Big Data Management System For Yellow Tea Growth
6	Marine And Fishery GIS Spatial_Temporal Data Organizing And Analyzing
7	Field Soil Data Clustering Method Based On Fusion Of Feature Distance And Information Entropy
8	Influence Of Missing Data On Accuracy Of Estimate Breeding Value Of Inner Mongolian Cashmere Goat
9	Research On Energy Efficient Clustering Routing And Data Fusion Of Wireless Sensor Networks In Agriculture
10	Study Of The Agricultural Data Mining Platform Based On Classification And Clustering