
Analysis Of Unbalanced Grain Loss Data Based On RockSmote-Rf

Posted on: 2020-11-24  Degree: Master  Type: Thesis
Country: China  Candidate: X L Liu  Full Text: PDF
GTID: 2428330578483459  Subject: Computer software and theory
Abstract/Summary:
Grain loss and waste bear directly on China's strategic priority of reducing grain loss and on the performance of the national grain circulation industry, and reducing them has long been a critical task for that industry. The grain industry has now accumulated a substantial body of data on post-harvest loss. Analyzing these data effectively matters both for developing grain loss assessment methods and for providing useful information for loss reduction.

Grain loss data has many attributes and is unbalanced. Compared with decision trees and similar algorithms, the random forest algorithm trains quickly and classifies multi-attribute data with high accuracy, so it is widely used in data analysis. However, the algorithm also shows weaknesses on multi-attribute unbalanced data, and there is still room for experimentation and optimization. To better analyze unbalanced grain loss data, this thesis first proposes, building on related research, an optimized method for handling unbalanced data with random forests and verifies its feasibility on standard data sets. It then analyzes the unbalanced grain loss data in detail, and finally extends the analysis to establish a grain loss prediction model. The work of this thesis covers the following aspects:

(1) First, data from the grain loss project are collected and stored through a grain loss questionnaire system. Analysis shows that grain loss severity is unevenly distributed, so the grain loss data is unbalanced. The thesis then reviews related work on handling unbalanced data with the random forest algorithm, in preparation for the subsequent algorithm optimization and grain data analysis.

(2) Second, to better handle the imbalance in grain loss data, and considering how balancing the data set improves random forest performance on unbalanced data, this thesis proposes RockSmote-Rf, a random forest algorithm with optimized oversampling. The algorithm first clusters the minority-class samples in the unbalanced data, then generates new samples by oversampling with a modified formula in order to balance the data set, and finally classifies the balanced data with the random forest algorithm. Five standard UCI data sets are selected to verify the feasibility of the algorithm, and its classification accuracy is compared on each data set with that of the standard random forest algorithm, the standard decision tree algorithm, and the SMOTE random forest algorithm. The experimental results demonstrate the effectiveness of the proposed algorithm.

(3) Finally, the data obtained from the post-harvest loss project are analyzed in detail. On the one hand, the random forest algorithm with optimized unbalanced-data handling is used to assess the degree of post-harvest grain loss qualitatively and to identify the key factors that influence it. The thesis compares this assessment method with the analysis methods used by other scholars and shows the advantages of the proposed method. On the other hand, building on the qualitative loss assessment, external information about the farmers is integrated to establish a quantitative prediction model of grain loss. The thesis also builds a grain loss system that supports the storage of grain loss data and visualizes the analysis methods.
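The oversampling step described in aspect (2) can be illustrated with a minimal sketch. The abstract specifies neither the clustering procedure nor the modified sampling formula, so the code below is only an assumption-laden stand-in: a tiny k-means replaces the clustering step, and the standard SMOTE interpolation between a sample and one of its nearest neighbors in the same cluster replaces the modified formula. The names `kmeans`, `cluster_smote`, `k_clusters`, and `k_neighbors` are hypothetical and do not come from the thesis.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=10):
    """Tiny k-means, used only as a stand-in for the thesis's clustering step."""
    centers = rng.sample(points, k)
    clusters = [points]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: dist2(p, centers[i]))
            clusters[idx].append(p)
        # Recompute centers; keep the old center if a cluster went empty.
        centers = [
            [sum(col) / len(c) for col in zip(*c)] if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return [c for c in clusters if c]

def cluster_smote(minority, n_new, k_clusters=2, k_neighbors=3, seed=0):
    """Generate n_new synthetic minority samples, interpolating within clusters."""
    rng = random.Random(seed)
    clusters = kmeans(minority, k_clusters, rng)
    # Interpolation needs two distinct points, so skip singleton clusters.
    usable = [c for c in clusters if len(c) >= 2]
    if not usable:
        raise ValueError("need a cluster with at least two minority points")
    synthetic = []
    for _ in range(n_new):
        # Pick a cluster in proportion to its size, then a base sample in it.
        cluster = rng.choices(usable, weights=[len(c) for c in usable])[0]
        base = rng.choice(cluster)
        others = sorted((p for p in cluster if p is not base),
                        key=lambda p: dist2(base, p))
        neighbor = rng.choice(others[:k_neighbors])
        # Standard SMOTE formula (assumption): base + u * (neighbor - base).
        u = rng.random()
        synthetic.append([b + u * (n - b) for b, n in zip(base, neighbor)])
    return synthetic

# Toy data: 6 minority samples against an assumed majority class of 20.
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
            [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]]
majority_size = 20
synthetic = cluster_smote(minority, majority_size - len(minority))
balanced_minority = minority + synthetic
```

After this step the minority class matches the majority class in size, and the balanced data set could be passed to any random forest implementation for the classification stage. Restricting interpolation to points in the same cluster is one plausible reason for clustering first, since it avoids creating synthetic samples in the empty space between separate minority groups.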
Keywords/Search Tags: Post-harvest loss of grain, Random forest, Unbalanced data, SMOTE