| Camellia oleifera Abel seed oil (camellia oil for short) is a high-oleic-acid vegetable oil unique to China. Its adulteration is relatively common and falls into three categories by adulterant: common vegetable oils, high-oleic-acid vegetable oils, and low-grade camellia oil. Adulteration with common vegetable oils can be identified from the oleic acid content and characteristic components such as sterols, but existing approaches remain unsystematic and incomplete. Adulteration with high-oleic-acid vegetable oils and low-grade camellia oil is the most common, yet effective identification methods are still lacking. Data fusion can overcome the shortcomings of existing single-component detection methods, such as large error and low sensitivity. Combined with chemometrics, it allows key information to be screened quickly from complex data and interpreted through mathematical models for better recognition. This paper therefore used 54 indicators from five classes of oil components to build chemometric models, including principal component analysis, cluster analysis, soft independent modeling of class analogy, partial least squares discriminant analysis, orthogonal partial least squares discriminant analysis (OPLS-DA), random forest, and support vector machine (SVM), together with their regression counterparts, and achieved identification of binary camellia oil adulteration at adulteration ratios above 5% (w/w). The main research contents are as follows. 1. For the qualitative identification of camellia oil adulterated with 8 common vegetable oils, binary and multiclass classification models were constructed from 29 key indicators covering fatty acids, triglycerides, tocopherols, squalene, and sterols. Unsupervised models built on each of four kinds of single components could differentiate pure oil samples, but adulterated samples were difficult to classify because they overlapped heavily. In a comparison of combinations of two 
single components (fatty acids/triglycerides) and five supervised binary-classification models, SVM gave the best recognition, with a total accuracy ≥ 95.19%, and thus accurately distinguished camellia oil from adulterated samples (adulteration ratio ≥ 5%). To further predict the adulterant oil type, input variables (two kinds of single components and six kinds of fusion indicators) were compared across multiclass models. SVM gave the best prediction performance regardless of the input variables, and with the SVM model the adulterant oil type was identified with 100% accuracy at adulteration ratios ≥ 10%. At adulteration ratios < 10%, the adulterant oil type could be further classified using data fusion: triglycerides + tocopherols were the most complementary combination, followed by fatty acids + tocopherols + sterols, with prediction accuracies of 98.08% and 95.19%, respectively. In addition, when a multiclass SVM model was used to predict low-, medium-, and high-level adulterated samples, a single component could identify only the high-concentration samples, whereas the remaining low- and medium-concentration samples could be further classified using triglyceride + tocopherol/sterol fusion, with accuracy ≥ 96.15%. 2. For the qualitative identification of camellia oil adulterated with high-oleic-acid vegetable oils, binary and multiclass classification models were constructed from 22 key indicators covering fatty acids, triglycerides, tocopherols, squalene, and sterols. Unsupervised models built on four kinds of single components divided camellia oil, refined olive oil, and high-oleic sunflower oil into three independent subsets, while low-concentration adulterated samples overlapped heavily. Camellia oil and adulterated samples could be classified with 100% accuracy, and the prediction accuracies were 98.99% and 96.55% using fatty acids and triglycerides, respectively. To further identify the adulterant oil type, the 
multiclass OPLS-DA and SVM models were developed with different input variables. SVM gave the best discrimination, with decision limits of 10% and 20% for the single fatty acid and triglyceride components, respectively. Fusion indicators lowered the decision limit to 5%: fatty acids + squalene and sterols, and triglycerides + tocopherols, gave the best recognition, with prediction accuracies of 100% and 96.88%, respectively. In addition, when the SVM model was used to predict low-, medium-, and high-level adulterated samples, all samples were identified with 100% accuracy using fatty acids. Using triglycerides, only high-concentration samples could be identified, whereas the remaining low- and medium-concentration samples were classified with 96.88% accuracy by feature-level fusion of triglycerides + tocopherols + squalene and sterols. 3. To predict the adulteration ratio of common and high-oleic-acid vegetable oils, PCR, PLS, OPLS, and SVR regression models were established using the corresponding indicators from the adulterant-type identification. The effects of the number of latent variables, the pretreatment method, and the number and type of variables on the models were investigated, and direct quantification was compared with quantification performed after qualitative identification. Using PLS with single fatty acid or triglyceride inputs, the optimal modeling parameters were determined, including latent variables (2 to 6), scaling method (UV/Par/Ctr), and key variables (fewer than 5); quantification after qualitative analysis performed significantly better than direct quantification. On this basis, SVR and OPLS performed best among the four regression models. With their optimal models, camellia oil adulterated with refined olive oil (20% to 100%) and with the other nine adulterant oils (5% to 100%) could be accurately quantified from a single component, with a coefficient of determination R² ≥ 0.9939 and the 
root-mean-square error of prediction ≤ 3.4344%. For refined olive oil adulteration below 20%, the fusion indicators (fatty acid + squalene and sterol, triglyceride + tocopherol) can be used for accurate quantification. In addition, 20 commercially available blind samples were screened using the established models, and 6 were found to be camellia oil adulterated with corn oil, rapeseed oil, and sunflower oil at adulteration ratios of 6% to 40%. 4. For identifying adulteration of first- and second-grade camellia oil, qualitative and quantitative models were developed using the fatty acid, triglyceride, and volatile composition characteristics of oil samples of different grades, with these three kinds of single components as variables. Compared with fatty acids and triglycerides, volatile components differed more clearly between grades and therefore gave better clustering of pure oil samples; 15 characteristic indicators were selected by VIP (variable importance in projection). Among the four discrimination models built on these indicators, OPLS-DA gave the best recognition of samples adulterated with low-grade oil, with a total accuracy of 92.06%. To predict the adulteration ratio, different pretreatment methods and regression models were compared in combination; unit variance scaling combined with OPLS gave the highest quantitative accuracy, with a linear coefficient above 0.9993 and a root-mean-square error on the prediction set close to 0. In summary, this paper constructed a chemometrics-based multi-indicator identification method for three camellia oil adulteration scenarios and achieved identification of binary camellia oil adulteration at adulteration ratios above 5% (w/w), providing a new approach to camellia oil authentication. |
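
The scaling options (UV/Par/Ctr), the feature-level fusion of component blocks, and the root-mean-square error of prediction named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the reading of Par as Pareto scaling and Ctr as mean-centering, and the numerical values are all assumptions chosen for illustration, and the fused matrix would in practice be passed to a classifier or regression model such as SVM or OPLS.

```python
import math
from statistics import mean, stdev


def scale_column(values, method="UV"):
    """Scale one indicator column (assumed definitions).

    Ctr: mean-centering only.
    UV:  mean-centering, then division by the sample standard deviation.
    Par: mean-centering, then division by the square root of that deviation.
    """
    if method not in ("Ctr", "UV", "Par"):
        raise ValueError(f"unknown scaling method: {method}")
    m = mean(values)
    centered = [v - m for v in values]
    if method == "Ctr":
        return centered
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        return centered  # a constant column cannot be rescaled
    divisor = s if method == "UV" else math.sqrt(s)
    return [c / divisor for c in centered]


def scale_block(block, method="UV"):
    """Scale every column of a samples-by-indicators block (list of rows)."""
    columns = [scale_column(list(col), method) for col in zip(*block)]
    return [list(row) for row in zip(*columns)]


def fuse_blocks(*blocks, method="UV"):
    """Feature-level fusion: scale each block, then join the rows per sample."""
    scaled = [scale_block(b, method) for b in blocks]
    return [sum(rows, []) for rows in zip(*scaled)]


def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction, in the units of y."""
    return math.sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))


# Hypothetical values for three samples (not data from the study).
triglycerides = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
tocopherols = [[100.0], [200.0], [300.0]]

# Each fused row holds 2 triglyceride + 1 tocopherol indicators on a common scale.
fused = fuse_blocks(triglycerides, tocopherols, method="UV")

# Hypothetical true vs. predicted adulteration ratios (% w/w).
error = rmsep([10.0, 20.0], [13.0, 17.0])
```

Scaling each block before concatenation keeps an indicator with large absolute values (here the hypothetical tocopherol column) from dominating the fused feature vector, which is one common motivation for comparing UV, Par, and Ctr pretreatments.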