Near infrared spectroscopy (NIRS), which has been developing gradually since the 1960s, is a novel and rapid quantitative analysis method. Because it requires no sample pretreatment, causes no pollution, is convenient and non-destructive, and supports online measurement, it is used in numerous fields, including agriculture, food, textiles and pharmacy.

Modern near infrared spectroscopy comprises three components: spectroscopy instruments, chemometric software and mathematical models. Fast and accurate analysis requires that these three components be closely integrated. Of the three, the method of constructing the mathematical model is the main research area. At present, commonly used methods include multiple linear regression (MLR), stepwise regression (SR), principal component regression (PCR), partial least squares regression (PLS) and artificial neural networks (ANN).

Although NIRS is fast, convenient and precise, its accuracy can be influenced by many factors. Among them, sample granularity and uniformity have the greatest effect, because variation in granularity directly changes the near infrared spectrum. In practice, test samples should be prepared under the same conditions as the standard samples, with standardized granularity and uniformity, so that the error caused by granularity is reduced. In addition, the selection, number and design of the standard samples influence the precision of the prediction. To improve the multivariate calibration and widen the application scope of the model, the choice of samples should take into account the ingredient content and the physical and chemical properties of the standard samples.
For example, if there are too few standard samples, they will not reflect the normal distribution of the tested population; if there are too many, the workload increases. The design of the standard samples also affects the calibration accuracy. If the measured component correlates strongly with the ingredient content, the same samples can be used for calibration. If the correlation is weak, the remaining samples can be used for calibration according to the corresponding selection rule, which improves the calibration and the test accuracy. Temperature changes also affect the prediction precision.

In this paper, one kind of support vector machine (SVM), least squares support vector regression (LS-SVR), is applied to the determination of alcohol content in distilled spirits based on NIRS. Vapnik and his colleagues developed SVMs, a learning algorithm based on statistical learning theory. Statistical learning theory focuses primarily on the statistics of small samples. Following the principle of structural risk minimization, SVMs minimize a bound on the generalization error, rather than the training error as under the principle of empirical risk minimization. Training an SVM is equivalent to solving a quadratic programming problem with linear constraints. SVMs were originally developed for pattern recognition problems. However, with the introduction of Vapnik's ε-insensitive loss function, SVMs have been extended to nonlinear regression estimation problems, where they exhibit excellent performance. For both pattern recognition and function estimation problems, the least squares formulation greatly simplifies the solution process of the standard SVM by replacing the inequality constraints with equality constraints. Consequently, the solution is obtained by solving a set of linear equations rather than a quadratic programming problem.
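For reference, the linear system mentioned above can be written out in the textbook LS-SVR form. This is a sketch in notation that is assumed here rather than taken from the paper: the regularization parameter γ, the feature map φ, the kernel k, the error variables e_i and the multipliers α_i are all illustrative symbols.

```latex
% Primal problem: equality constraints replace the inequality constraints of the standard SVR
\min_{w,\,b,\,e}\ \frac{1}{2}\, w^{\top} w + \frac{\gamma}{2} \sum_{i=1}^{n} e_i^{2}
\quad \text{subject to} \quad
y_i = w^{\top} \varphi(x_i) + b + e_i, \qquad i = 1, \dots, n

% Eliminating w and e from the optimality conditions leaves one linear system in (b, \alpha)
\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & K + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ y \end{bmatrix},
\qquad K_{ij} = k(x_i, x_j) = \varphi(x_i)^{\top} \varphi(x_j)

% Prediction for a new input x
f(x) = \sum_{i=1}^{n} \alpha_i\, k(x, x_i) + b
```

With a Gaussian kernel, the only quantities left to tune in this form are γ and the kernel width.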
Compared with the standard SVR algorithm, LS-SVR is simpler because it has one fewer tuning parameter and fewer optimization variables. It therefore converges faster.

In our study, mixtures of water and ethanol were prepared as 55 standards with ethanol contents of 0-100.00%. The NIR spectra in the 850-1870 nm region were measured at ambient temperature with a UV-VIS-NIR spectrometer (UV-3150, SHIMADZU Corporation, Japan), scanning in 1 nm steps with a 2.0 nm slit width. Each mixture was contained in a quartz cuvette with a path length of 1.0 mm. The spectrum of the empty cell was taken as the background, and the NIR spectrum of each mixture was measured three times. The mean of the three measurements is taken as the NIR spectrum of the corresponding sample. Different functional groups have different characteristic absorption wavelengths for their overtone and combination bands. Based on the NIR spectra of the water and ethanol mixtures, the optimal characteristic wave band, the one most closely related to the functional groups, was selected from the 850-1870 nm region. A model based on NIR and LS-SVR was then constructed and used to determine the ethanol content. This paper examines the importance of the choice of hyperparameters in improving algorithm performance. Our proposed model is compared with multiple linear regression. Numerical experiments demonstrate that LS-SVR generalizes better than MLR and that the prediction precision is clearly improved. The presented method is an efficient approach to regression estimation.
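The kind of comparison described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and does not use their data. It assumes a Gaussian (RBF) kernel, hypothetical hyperparameter values `gamma` and `sigma`, and a synthetic one-feature "absorbance" signal that depends nonlinearly on ethanol content. All function names (`rbf_kernel`, `lssvr_fit`, `lssvr_predict`, `mlr_fit`, `mlr_predict`) are illustrative.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def lssvr_fit(X, y, gamma, sigma):
    """Solve the LS-SVR linear system for the bias b and multipliers alpha."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                      # first row:    [0, 1^T]
    A[1:, 0] = 1.0                      # first column: [0, 1]^T
    A[1:, 1:] = K + np.eye(n) / gamma   # K + I / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]


def lssvr_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i k(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


def mlr_fit(X, y):
    """Ordinary least squares with an intercept term."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def mlr_predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data (an assumption, not measured spectra):
# 55 standards with ethanol content from 0 to 100 %.
rng = np.random.default_rng(0)
content = np.linspace(0.0, 100.0, 55)
frac = content / 100.0
X = (frac + 0.4 * frac**2 + rng.normal(scale=0.005, size=frac.shape))[:, None]

# Alternate samples between a calibration set and a validation set.
is_train = np.arange(content.size) % 2 == 0
X_train, y_train = X[is_train], content[is_train]
X_test, y_test = X[~is_train], content[~is_train]

gamma, sigma = 100.0, 0.5  # hypothetical hyperparameter values
b, alpha = lssvr_fit(X_train, y_train, gamma, sigma)
pred_lssvr = lssvr_predict(X_train, b, alpha, X_test, sigma)

coef = mlr_fit(X_train, y_train)
pred_mlr = mlr_predict(coef, X_test)

print("LS-SVR validation RMSE:", rmse(y_test, pred_lssvr))
print("MLR validation RMSE:   ", rmse(y_test, pred_mlr))
```

In a real study, `gamma` and `sigma` would be chosen by cross-validation on the calibration set, which is where the choice of hyperparameters affects performance. The values above are placeholders.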