| In recent years, red tides, brown tides, and other harmful algal blooms have increasingly affected China’s coastal areas, posing a serious threat to the environment and to human health. Establishing rapid, real-time monitoring of planktonic algae concentration, for both routine and emergency detection of harmful algal blooms, is therefore an important task and an active research topic. Traditional methods of measuring plankton concentration demand considerable operator expertise, are difficult to use in the field, and involve complicated procedures. By comparison, three-dimensional (3D) fluorescence spectroscopy offers good selectivity, high sensitivity, non-destructive measurement, and easy field deployment. In this paper, we therefore combine 3D fluorescence spectroscopy with Zernike moments and ensemble learning to predict the concentration of the causative pelagophyte of brown tide and to quantify the concentrations of the component species in mixed algal samples. The main work is as follows. Firstly, the relationship between microalgal concentration and fluorescence intensity was derived from the principle of algal in vivo fluorescence together with the Lambert-Beer law, and the feasibility of predicting algal concentration from fluorescence spectra was analyzed. The main causative alga of brown tide, Aureococcus anophagefferens (A. anophagefferens), and two comparison species, Chlorella and Synechococcus elongatus, were selected as experimental species and cultured in the laboratory; their fluorescence spectra were collected with an FLS920 fluorescence spectrometer and their concentrations were determined by microscope counting. The zero-setting method and Delaunay triangular interpolation were used to eliminate scattering interference, and Savitzky-Golay (SG) polynomial fitting was applied to 
smooth the noise in the original spectra; abnormal spectral samples arising from experimental uncertainties were then removed by combining isolation forest with cumulative similarity scores. Secondly, to address spectral feature extraction and concentration prediction, a method was proposed that extracts features from the original 3D fluorescence spectra with Zernike moments and builds prediction models of brown tide algae concentration with ensemble learning algorithms. Thirty-six Zernike moments of the 3D fluorescence spectra of A. anophagefferens were selected by image reconstruction and extracted, forming the feature matrix Combo 1. To address overfitting of the ensemble learning models and the multi-resolution character of Zernike moments, the BorutaShap feature selection algorithm was introduced to build three feature selection models, BorutaShap_RF, BorutaShap_GBDT, and BorutaShap_XGBoost, which yielded three feature subsets, Combo 2, Combo 3, and Combo 4. These were combined with the regression models Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Extreme Gradient Boosting (XGBoost) to predict the concentration of A. anophagefferens. Of the three, BorutaShap_GBDT screened features most effectively and gave each regression model its best performance; on the test set, the coefficient of determination and mean absolute percentage error of the best regression model were 0.9376 and 0.0687, respectively. Finally, for the 3D fluorescence spectra of 135 mixed algal samples with different concentration ratios, image reconstruction indicated an optimal maximum Zernike moment order of 9. The extracted Zernike moments were combined with Partial Least Squares and Gradient Boosting Decision Tree multiple regression algorithms to establish prediction models for the concentration of each component alga in the mixed samples. We compared the results of the two 
models: for the component algae A. anophagefferens, Chlorella, and Synechococcus elongatus, the models based on the Gradient Boosting Decision Tree achieved test-set coefficients of determination all greater than 0.94, with mean absolute percentage errors all less than 0.16 and mean absolute errors all less than 0.05. |
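
The Zernike-moment feature extraction described above can be illustrated with a minimal sketch. It treats a 3D fluorescence spectrum (an excitation-emission intensity matrix) as a grayscale image and computes Zernike moment magnitudes with NumPy. Everything in it is an assumption rather than the authors' implementation: the function names are hypothetical, only moments with m ≥ 0 are kept, magnitudes are used as features, and the matrix is mapped onto the square [-1, 1] × [-1, 1] with values outside the inscribed unit disk discarded. Under this indexing convention a maximum order of 10 gives 36 moments and a maximum order of 9 gives 30; the count of 36 reported for the single-species model is consistent with order 10, although the abstract does not state which order was used there.

```python
import math
import numpy as np


def zernike_indices(max_order):
    """All (n, m) with 0 <= m <= n <= max_order and n - m even (assumed convention)."""
    return [(n, m)
            for n in range(max_order + 1)
            for m in range(n + 1)
            if (n - m) % 2 == 0]


def radial_poly(n, m, rho):
    """Zernike radial polynomial R_n^m evaluated element-wise on the array rho."""
    out = np.zeros_like(rho, dtype=float)
    for k in range((n - m) // 2 + 1):
        coef = ((-1) ** k * math.factorial(n - k)
                / (math.factorial(k)
                   * math.factorial((n + m) // 2 - k)
                   * math.factorial((n - m) // 2 - k)))
        out += coef * rho ** (n - 2 * k)
    return out


def zernike_moments(image, max_order):
    """Magnitudes |Z_nm| of a 2D array mapped onto the unit disk."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    # Map pixel centres onto [-1, 1] x [-1, 1]; y varies along rows, x along columns.
    y, x = np.meshgrid(np.linspace(-1.0, 1.0, rows),
                       np.linspace(-1.0, 1.0, cols),
                       indexing="ij")
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0  # pixels outside the unit disk are ignored
    pixel_area = (2.0 / (rows - 1)) * (2.0 / (cols - 1))

    features = []
    for n, m in zernike_indices(max_order):
        # Complex conjugate of the basis function V_nm = R_nm(rho) * exp(j*m*theta).
        basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
        z_nm = ((n + 1) / np.pi
                * np.sum(img[inside] * basis_conj[inside])
                * pixel_area)
        features.append(abs(z_nm))
    return np.array(features)
```

Calling `zernike_moments(eem, 10)` on a hypothetical 2D array `eem` returns a vector of 36 magnitudes that could serve as one row of a feature matrix such as Combo 1. The magnitudes are unchanged when the input is rotated about the grid centre, which is the usual reason for taking the magnitude rather than the complex value.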