| To explore the feasibility of data fusion technology for identifying soybean producing areas in neighboring regions, 216 soybean samples were collected from the Heilongjiang Agricultural Reclamation Jiusan Administration and neighboring Suihua, and the contents of 13 mineral elements, including magnesium (Mg), aluminum (Al) and phosphorus (P), and 5 fatty acids, including palmitic acid, stearic acid and oleic acid, were determined. Principal component analysis (PCA) was applied separately to the mineral element data and the fatty acid data, the principal components with eigenvalues greater than 1 were retained, and the origin-discriminating features from the two data sources were combined into a feature-level fusion data set. Support vector machine (SVM) models with four kernel functions were built on the mineral element, fatty acid, data-level fusion and feature-level fusion data. The parameters of the best-performing model, the linear-kernel SVM, were optimized by moth-flame optimization (MFO). The discrimination performance of the models on the different data sets was compared before and after optimization, and also against a discrimination model built by linear regression. The main conclusions are as follows: (1) Both linear regression and SVM can be applied to identify soybean producing areas in neighboring regions, and the SVM model is more accurate and reliable than the linear regression model. (2) The MFO algorithm can find the best parameters for the linear-kernel SVM. After MFO optimization, the prediction accuracy of the models based on the different data sets improved significantly, and the feature-level data fusion model achieved the highest accuracy, 98.46%. (3) For the linear-kernel SVM model based on data fusion, feature-level fusion reduced the original 13 mineral elements and 5 fatty acids to 4 mineral element principal components and 2 fatty acid principal components. On the external test set, the accuracy of the feature-level fusion model increased from 86.15% to 92.31%, and its generalization ability improved. (4) Comparison of the different data identification techniques in terms of model accuracy, generalization ability, degree of fit and model construction cost shows that feature-level data fusion is highly effective for identifying the geographical indication product "Jiusan Soybean", offering high prediction accuracy, strong generalization ability, a low degree of over-fitting and low model construction cost. |
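
The feature-level fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic placeholders with the same dimensions as the study (216 samples, 13 mineral elements, 5 fatty acids), the helper name `kaiser_pca_scores` is hypothetical, and data-level fusion is assumed to mean simple concatenation of the raw variables. Because the data are random, the number of retained components will generally differ from the 4 and 2 reported in the abstract.

```python
import numpy as np

def kaiser_pca_scores(X):
    """Return PCA scores and eigenvalues for components with eigenvalue > 1."""
    # Standardize each variable so that PCA works on the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                         # eigenvalue-greater-than-1 rule
    return Z @ eigvecs[:, keep], eigvals[keep]

# Synthetic placeholder data (assumption): NOT the measured soybean values.
rng = np.random.default_rng(0)
n_samples = 216
minerals = rng.normal(size=(n_samples, 13))
fatty_acids = rng.normal(size=(n_samples, 5))

# PCA is run on each data source separately, as in the abstract.
mineral_scores, mineral_eig = kaiser_pca_scores(minerals)
fatty_scores, fatty_eig = kaiser_pca_scores(fatty_acids)

# Data-level fusion (assumed): concatenate the raw variables.
data_level = np.hstack([minerals, fatty_acids])
# Feature-level fusion: concatenate the retained principal component scores.
feature_level = np.hstack([mineral_scores, fatty_scores])

print("data-level shape:", data_level.shape)
print("feature-level shape:", feature_level.shape)
```

Either fused matrix could then be passed to a classifier such as a linear-kernel SVM. Running PCA per source before concatenation keeps the fused feature set small, which is consistent with the lower model construction cost the abstract attributes to feature-level fusion.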