
Research On Flowering Prediction Based On Machine Learning

Posted on: 2021-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: X W Zhang
GTID: 2510306725952289
Subject: Communication and Information System
Abstract/Summary:
Flowering forecasting is an important part of both the agro-meteorological index system and the meteorological service system. Predicting the flowering date is a regression problem and also a time series prediction problem. At present, flowering prediction models are built as regression models, and this approach has three limitations. First, most studies seek to improve model accuracy only at the algorithm stage; few try to improve it by optimizing data quality. Second, the actual flowering date is an integer number of days while the predicted value is a decimal, so a rounding problem is unavoidable. Finally, because a regression model issues its prediction well in advance, a long gap separates the prediction from the actual flowering date, and low-temperature cold damage and other meteorological disasters during this gap affect the prediction results.

This thesis uses the growth process of local "Red Fuji" apples in Ji County, Linfen City, Shanxi Province as research material. Using machine learning and deep learning algorithms, it proposes static and dynamic flowering prediction models suited to Ji County. The static prediction model improves on the existing model: it uses a multiple linear regression model as a data-quality filter and optimizes the regression model with ensemble learning. It can forecast flowering at least 21 days in advance. The dynamic prediction model combines a multivariable LSTM network with an ensemble-learning binary classification task to predict, on each day from March 25 to April 30 of every year, whether flowering occurs in the next three days.

The research results are as follows:

1. Pearson correlation coefficient of time interval and meteorological factors
Pearson correlation analysis was carried out between meteorological factors and time intervals for 12 different periods, namely the early, middle and late ten-day periods of each month from September to December. No meteorological factor is highly correlated with the time interval; the correlations are only low to medium. For the periods from September 1 to October 1, the most strongly correlated factor is 10 cm ground temperature; for the periods from October 11 to December 11, it is precipitation; and for December 21, it is the number of sunshine hours.

2. Static prediction model
Multiple linear regression models with four and with five features were established for the data of the 12 time periods in order to determine the best prediction period and the optimal number of features. The model was then improved using PCA dimensionality reduction and ensemble learning. The multiple linear regression model determines that the best prediction period starts on October 1 and ends on March 15 of the following year. The optimal number of features is 4; for this model the coefficient of determination is -1.1583, the root mean square error is 6.3473 and the explained variance score is -1.0022. For the RFR and SGD weighted average model and the RFR and SVR weighted average model, which apply ensemble learning after PCA dimensionality reduction, the coefficients of determination are 0.3651 and 0.3763, the root mean square errors are 3.4425 and 3.4121, and the explained variance scores are 0.3763 and 0.3763.

3. Dynamic prediction model
Periodic analysis of the meteorological factors is used to find factors with obvious periodicity. Two features, of air temperature and ground temperature, are added, and the new features are also periodic. A multivariable LSTM network uses the data of the previous 30 days to predict the data of the next 3 days. After 100 iterations, the mean square error of the model is less than 0.05. The prediction results are evaluated by rolling prediction, and the root mean square error of the model is less than 0.6. Finally, a combination strategy takes the arithmetic average of the predictions of two binary classification learners, RF and AdaBoost, to determine whether a given day falls in the flowering period. The AUC of the combined model is 0.82, while the AUC values of the individual learners are 0.81 and 0.80 respectively.
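The abstract states that the multivariable LSTM uses the previous 30 days of data to predict the next 3 days. The sketch below only illustrates how such input/target windows could be cut from a daily series; it is not the thesis's code. The function name `make_windows`, the two-variable layout and the synthetic values are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Row = List[float]


def make_windows(
    series: Sequence[Sequence[float]], n_in: int = 30, n_out: int = 3
) -> List[Tuple[List[Row], List[Row]]]:
    """Cut a daily multivariate series into (input, target) pairs.

    Each input holds n_in consecutive days and each target holds the
    n_out days that immediately follow it. The window slides by one day.
    """
    pairs = []
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        x = [list(row) for row in series[start:start + n_in]]
        y = [list(row) for row in series[start + n_in:start + n_in + n_out]]
        pairs.append((x, y))
    return pairs


# Synthetic stand-in data (hypothetical): 40 days, 2 variables per day.
days = [[float(d), float(d) * 0.5] for d in range(40)]
pairs = make_windows(days)
first_input, first_target = pairs[0]
```

With 40 days of data this yields 8 windows; a series shorter than 33 days yields none. Each pair could then be stacked into arrays of shape (samples, 30, variables) and (samples, 3, variables) for whichever LSTM implementation is used.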
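The combination strategy described in the abstract averages the outputs of two binary classifiers and evaluates the result by AUC. The following is a minimal sketch of that idea under stated assumptions: the probability lists stand in for the outputs of already trained RF and AdaBoost models, the values are synthetic, the 0.5 decision threshold is an assumption, and the helper names are hypothetical. It does not reproduce the thesis's reported AUC values.

```python
from typing import List, Sequence


def average_probabilities(p_a: Sequence[float], p_b: Sequence[float]) -> List[float]:
    """Arithmetic mean of two learners' positive-class probabilities."""
    return [(a + b) / 2.0 for a, b in zip(p_a, p_b)]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic example (hypothetical values): 1 = flowering period, 0 = not.
labels = [0, 0, 1, 1, 0, 1]
p_rf = [0.2, 0.6, 0.7, 0.4, 0.1, 0.9]   # stand-in for RF probabilities
p_ada = [0.4, 0.3, 0.5, 0.8, 0.3, 0.6]  # stand-in for AdaBoost probabilities

combined = average_probabilities(p_rf, p_ada)
is_flowering = [p >= 0.5 for p in combined]  # 0.5 threshold is an assumption
combined_auc = auc(labels, combined)
```

Averaging probabilities rather than hard votes keeps each learner's confidence in the combined score, which matters for a ranking metric such as AUC.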
Keywords/Search Tags: Multiple linear regression, Ensemble learning, Regression task, Multivariable LSTM, Classification task