| Maize, as the largest grain crop planted in our country, plays a pivotal role in ensuring national food security. The yield and quality of maize are closely related to the variety. Traditional variety identification methods are time-consuming, destructive, and complicated to operate. To address these shortcomings, hyperspectral imaging combined with machine learning algorithms was used to identify six varieties of maize seeds. The specific research contents are as follows: (1) Hyperspectral data were collected for the six maize seed varieties. Three methods, multiplicative scatter correction (MSC), Savitzky-Golay smoothing with first derivative (SG1), and detrending (Det), were used to preprocess the original spectral data. Four types of variety identification models, based on linear discriminant analysis (LDA), support vector machine (SVM), k-nearest neighbors (KNN), and decision tree (DT), were constructed on the original data and on the three kinds of preprocessed spectral data. The results showed that MSC gave the best preprocessing effect and LDA had the highest accuracy. Feature bands were then extracted from the MSC-preprocessed data by three methods: competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and iteratively retaining informative variables (IRIV). LDA, SVM, KNN, and DT variety identification models were constructed on the feature bands extracted by each method. The results showed that the bands extracted by IRIV better represented the full-band data and improved identification accuracy, and that LDA again had the highest accuracy. Overall, the MSC-IRIV-LDA model gave the best variety identification performance, with a prediction-set accuracy of 0.9333 and a Kappa coefficient of 0.9186. (2) The structure of LDA was optimized
by a Bayesian optimization algorithm. With a Gaussian process as the probabilistic surrogate model and EIP as the acquisition function, the Delta and Gamma hyperparameters of LDA were optimized iteratively. The discrimination results of the optimized LDA model were compared with those of the LDA model without hyperparameter optimization. The results showed that, with the feature bands extracted by IRIV as the model input, the prediction-set accuracy of the optimized LDA model increased by 1.19% and the Kappa coefficient increased by 1.48% compared with the unoptimized LDA. (3) A random subspace ensemble learning (RSEL) model using LDAB as the individual learner was established to improve the efficiency of seed variety identification. The subspace feature dimension of RSEL and the number of individual learners were tuned, and the combination of the two parameters was determined. A full-band RSEL based on MSC-preprocessed data and feature-band RSELs based on bands extracted by the IRIV and CARS methods were established, and the modeling results of LDA, LDAB, and RSEL were compared. The results showed that RSEL models based on different spectral data maintained high accuracy and stability, with a lowest accuracy of 0.9222 and a highest accuracy of 0.9556. The method proposed in this study can achieve non-destructive and effective identification of maize seed varieties, and provides a reference and new ideas for non-destructive identification of other crop seed varieties. |
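
As an illustration of two quantities named above, the MSC preprocessing step and the Kappa coefficient used for evaluation, the following is a minimal pure-Python sketch. It is not the implementation used in the study: the function names `msc` and `cohen_kappa` are hypothetical, the mean spectrum is assumed as the MSC reference, and Kappa is assumed to be Cohen's kappa computed from true and predicted labels.

```python
def msc(spectra):
    """Multiplicative scatter correction (sketch).

    Each spectrum is regressed on the mean spectrum (assumed reference),
    x = intercept + slope * reference, and then corrected as
    (x - intercept) / slope. `spectra` is a list of equal-length lists.
    Assumes the reference is not flat and no fitted slope is zero.
    """
    n_bands = len(spectra[0])
    # Reference spectrum: band-wise mean over all samples.
    reference = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_bands)]
    ref_mean = sum(reference) / n_bands
    ref_var = sum((r - ref_mean) ** 2 for r in reference)

    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        # Ordinary least-squares fit of this spectrum against the reference.
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance (sketch).

    Undefined (division by zero) when chance agreement equals 1.
    """
    y_true, y_pred = list(y_true), list(y_pred)
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Observed agreement is the plain classification accuracy.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the marginal label frequencies.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (observed - expected) / (1 - expected)
```

In this sketch, spectra that differ from each other only by a multiplicative gain and an additive offset are mapped onto the same corrected curve, which is the scatter effect MSC is meant to remove. The kappa function takes the same label lists that would be used to compute prediction-set accuracy.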