| As precipitation and runoff erode the soil, regions with high soil nitrogen loss have become the "source" of non-point source pollution. Locating this "source" requires a clear understanding of how soil nitrogen content is distributed across the study area. Traditional laboratory chemical analysis methods are limited in their ability to quickly analyze soil total nitrogen content when large numbers of samples are involved. However, different soils have their own unique spectral reflectance characteristics, so using hyperspectral technology to rapidly monitor soil total nitrogen content is of both theoretical and practical significance. In this study, 383 sampling points were laid out on a grid in the Daye Lake basin, and the spectral data and total nitrogen content of the collected samples were measured. After smoothing, the spectral data were subjected to the following transformations: continuum removal, standard normal variate transformation, first- and second-order derivatives, and the square root, inverse, logarithm and inverse logarithm, each together with its first- and second-order derivatives. Difference, ratio and normalized spectral indices were then constructed from every pair of bands of the transformed spectral reflectance, and correlation analysis with soil total nitrogen was performed to find the bands with the strongest correlation. The transformed spectral data and the constructed soil spectral indices were each correlated with soil total nitrogen content, and three machine learning regression methods, namely Random Forest (RF), Support Vector Machine (SVM) and BP Neural Network (BPNN), were used to construct hyperspectral inversion models for all sample points of the Daye Lake watershed as well as for forest land and farmland
soil total nitrogen values, and the accuracy of the constructed models was compared to select the optimal inversion model. The main conclusions of this study are as follows: (1) Total nitrogen values in the Daye Lake watershed are higher in the soils of the northern, central-eastern, northwestern and southeastern parts of the watershed, and lower in the low mountain areas in the north and south. Mean total nitrogen content is highest in forest soils and lowest in bare land soils, and differences in land use type lead to differences in soil total nitrogen content. (2) Most of the mathematically transformed spectra showed peaks at 1400 nm, 1900 nm and 2200 nm. The mathematical transformations compressed the spectral data to some extent, which generally improved the consistency and comparability of the data. Models built on the first-order derivative, logarithm, inverse, square root, inverse logarithm, continuum removal and standard normal variate transformations achieved good modeling accuracy, and spectral transformation effectively improved model accuracy compared with modeling the untransformed spectra directly. (3) Difference, ratio and normalized spectral indices were constructed for any two spectral bands after transformation, and the correlations between the different spectral indices and total nitrogen were compared. The correlation coefficient was 0.618. The accuracy of the spectral index models for all sample points and for the forest sample points was slightly lower than that of the models constructed after spectral transformation. The prediction accuracy of the models constructed with the spectral indices was improved to some extent. Overall, models constructed from normalized spectral indices were more accurate than those constructed from the other two types of spectral index. (4) The best prediction model for the
full set of sample points in the Daye Lake basin is the inverse log-BPNN model, with a training set R² of 0.536 and RMSE of 2.074, and a prediction set R² of 0.623 and RMSE of 2.601; the best prediction model for total nitrogen in forest land is the continuum removal-BPNN model, with a training set R² of 0.681 and RMSE of 1.974; the best prediction model for total nitrogen in agricultural soils is the log-NDSI-RF model, with a training set R² of 0.830 and RMSE of 1.093, and a prediction set R² of 0.707 and RMSE of 2.574. Overall, for the inversion of total nitrogen in the Daye Lake watershed, the models constructed using the BPNN and RF methods predict well for all sample points, forest sample points and farmland sample points. |
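
The two-band index search described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the forms below are assumptions: difference `a - b`, ratio `a / b`, and normalized difference `(a - b) / (a + b)`. The labels `DI` and `RI`, the function names, and the list-of-lists input layout are hypothetical; only `NDSI` appears in the study's model names. Reflectance values are assumed to be positive so that the divisions are defined.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # treat a constant series as uncorrelated
    return sxy / sqrt(sxx * syy)


# Assumed forms of the three two-band indices (not stated in the abstract).
INDICES = {
    "DI": lambda a, b: a - b,                # difference index
    "RI": lambda a, b: a / b,                # ratio index
    "NDSI": lambda a, b: (a - b) / (a + b),  # normalized difference index
}


def best_band_pair(spectra, nitrogen):
    """Return (abs_r, index_name, band_i, band_j) with the largest |r|.

    spectra:  one list of per-band reflectance values per sample
    nitrogen: measured total nitrogen, one value per sample
    Only pairs with band_i < band_j are scanned, so the ratio index is
    evaluated in a single band order.
    """
    n_bands = len(spectra[0])
    best = (0.0, None, None, None)
    for i, j in combinations(range(n_bands), 2):
        for name, index in INDICES.items():
            values = [index(s[i], s[j]) for s in spectra]
            r = abs(pearson(values, nitrogen))
            if r > best[0]:
                best = (r, name, i, j)
    return best
```

Calling `best_band_pair(spectra, nitrogen)` on transformed spectra returns the index type and band pair most strongly correlated with total nitrogen. With real hyperspectral data the number of band pairs is large, so a vectorized implementation would be the practical choice.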